REPORT  DOCUMENTATION  PAGE 

Form  Approved 

OMB  No.  0704-0188 


1.  REPORT  DATE  (DD-MM-YYYY)  2.  REPORT  TYPE 

15-12-2005  Bi-annual  Performance/Technical  Report 

3.  DATES  COVERED  (From  -  To) 

06/01/2005-  11/30/2005 

4.  TITLE  AND  SUBTITLE 

Bi-annual  (6/2005-11/2005)  Performance/Technical  Report 
for  ONR  YIP  Award  under  Grant  N00014-03-1-0466 

Energy  Efficient  Wireless  Sensor  Networks  Using  Fuzzy  Logic 

5a.  CONTRACT  NUMBER 

5b.  GRANT  NUMBER 

N00014-03-1-0466 

5c.  PROGRAM  ELEMENT  NUMBER 

6.  AUTHOR(S) 

Liang,  Qilian 

5d.  PROJECT  NUMBER 

5e.  TASK  NUMBER 

5f.  WORK  UNIT  NUMBER 

7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

University  of  Texas  at  Arlington 

Office  of  Sponsored  Projects 

PO  Box  19145 

Arlington,  TX  76019 

8.  PERFORMING  ORGANIZATION 

REPORT  NUMBER 

9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

Office  of  Naval  Research 

800  North  Quincy  Street 

Arlington,  VA  22217-5660 

10.  SPONSOR/MONITOR'S  ACRONYM(S) 

ONR 

11.  SPONSOR/MONITOR'S  REPORT 

NUMBER(S) 

12.  DISTRIBUTION/AVAILABILITY  STATEMENT 

Approved  for  Public  Release;  Distribution  is  Unlimited. 

13.  SUPPLEMENTARY  NOTES 

14.  ABSTRACT 

During  the  period  of  6/1/2005  -  11/30/2005,  we  have  performed  different  studies  on  wireless  sensor  networks.  1)  We  studied 
multi-target  detection  in  radar  sensor  networks.  2)  We  made  interference  analysis  and  performance  evaluation  on  UWB  Sensor 
Networks  in  hostile  environment.  3)  An  energy  consumption  and  latency  estimation  scheme  based  on  statistical  modeling  and 
maximal-likelihood  detection  was  investigated  for  wireless  sensor  networks.  4)  Self-organization  in  underwater  acoustic  sensor 
networks  was  studied.  5)  We  designed  a  cross-layer  optimization  scheme  for  mobile  ad  hoc  networks  using  fuzzy  logic  systems. 
6)  A  distributed  query  processing  algorithm  for  data-centric  sensor  networks  was  proposed.  7)  We  investigated  an  asynchronous 
energy-efficient  MAC  protocol  for  UWB  Sensor  Networks.  Ten  papers  were  produced  during  the  past  six  months,  and  are 
attached  to  this  report. 


15.  SUBJECT  TERMS 

Wireless  Sensor  Network,  Energy  Efficiency,  Fuzzy  Logic,  Radar,  UWB. 


16.  SECURITY  CLASSIFICATION  OF:  a.  REPORT:  U;  b.  ABSTRACT:  U;  c.  THIS  PAGE:  U 

17.  LIMITATION  OF  ABSTRACT:  UU 

18.  NUMBER  OF  PAGES:  124 

19a.  NAME  OF  RESPONSIBLE  PERSON:  Qilian  Liang 

19b.  TELEPHONE  NUMBER  (Include  area  code):  817-272-1339 

Standard  Form  298  (Rev.  8/98) 


Prescribed  by  ANSI  Std.  Z39.18 


DISTRIBUTION  STATEMENT  A 
Approved  for  Public  Release 
Distribution  Unlimited 


Bi-annual  (6/1/2005-11/30/2005)  Performance/Technical  Report 
for  ONR  YIP  Award  under  Grant  N00014-03-1-0466 
Energy  Efficient  Wireless  Sensor  Networks  Using  Fuzzy  Logic 


Qilian  Liang 

Department  of  Electrical  Engineering 
University  of  Texas  at  Arlington 
Arlington,  TX  76019-0016  USA 
Phone:  817-272-1339,  Fax:  817-272-2253 
E-mail:  liang@uta.edu 


Abstract 

During the period 6/1/2005 - 11/30/2005, we carried out the following studies on wireless
sensor networks.

1.  We  studied  multi-target  detection  in  radar  sensor  networks. 

2. We carried out interference analysis and performance evaluation of UWB sensor networks in
hostile environments.

3.  An  energy  consumption  and  latency  estimation  scheme  based  on  statistical  modeling  and 
maximal-likelihood  detection  was  investigated  for  wireless  sensor  networks. 

4.  Self-organization  in  underwater  acoustic  sensor  networks  was  studied. 

5.  We  designed  a  cross-layer  optimization  scheme  for  mobile  ad  hoc  networks  using  fuzzy 
logic  systems. 

6.  A  distributed  query  processing  algorithm  for  data-centric  sensor  networks  was  proposed. 

7.  We  investigated  an  asynchronous  energy-efficient  MAC  protocol  for  UWB  Sensor  Networks. 

Ten  papers  were  produced  during  the  past  six  months,  and  are  attached  to  this  report. 


1  Multi-Target  Detection  in  Radar  Sensor  Networks 

Radar is a powerful sensor system that has been employed for the detection and location of reflecting
objects such as aircraft, ships, vehicles, people, and the natural environment. By radiating energy
into space and detecting the echo signal reflected from an object or target, a radar system can
determine the presence of a target. Furthermore, by comparing the received echo signal with the
transmitted signal, the location of a target can be determined along with other target information.

A conventional radar system operates as an independent entity. In a resource-constrained
wireless sensor network, however, such detached operation may lead to degraded performance and waste
of limited resources. Cooperative techniques such as joint coding and joint detection appear
very promising for optimizing system performance under constrained resources. In [1], we studied
data fusion in a multi-target radar sensor network.

Much prior research on data fusion is based on the assumption of lossless communication, i.e.,
the information sent from local sensors is perfectly recovered at the fusion center. Other researchers
addressed the problem of distributed detection with constrained system resources, and most of them
provided solutions that optimize sensor selection. On the other hand, decision fusion with non-ideal
communication channels has been studied at both the fusion center level and the sensor level.
Channel-aware decision fusion rules were later developed using a canonical distributed detection system
in which binary decisions from multiple parallel sensors are transmitted through fading channels to
a fusion center. Lin extended the channel-aware decision fusion rules to multi-hop WSNs. These
results, however, were mostly obtained for the detection of a single target or a single event and are
not applicable to multi-target situations. Furthermore, in a radar sensor system, when clutter (the
unwanted echoes from the natural environment) is much larger than receiver noise, detection can
be quite different from the case in which noise is dominant.

In [1], we presented the theoretical formulation of the decision fusion problem for the multi-target case.
The objective of this work is to extend the previously developed channel-aware decision fusion rules to a
multi-target radar sensor system. We assumed that the multiple targets are stationary
targets in clutter. As a first-stage study, we used the Rayleigh target fluctuation model and Gaussian
clutter. In particular, we assumed that the radar, when receiving, is a constant false alarm rate
(CFAR) receiver. A CFAR receiver automatically raises the threshold level to keep clutter echoes and
external noise from overloading the system, which provides good clutter rejection.
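
To illustrate the adaptive threshold, the following is a minimal sketch of a cell-averaging CFAR detector. Cell averaging is one common CFAR variant and is assumed here for illustration only; the report does not state which variant was used in [1]. The function name, the window sizes, and the scale factor are all hypothetical.

```python
def ca_cfar(power, num_train=8, num_guard=2, scale=4.0):
    """Return indices of cells whose power exceeds scale * local clutter estimate.

    All parameter names and default values are illustrative assumptions.
    """
    half_train = num_train // 2
    half_guard = num_guard // 2
    detections = []
    for i, cell in enumerate(power):
        # Training cells lie on both sides of the cell under test,
        # separated from it by guard cells.
        left = power[max(0, i - half_guard - half_train):max(0, i - half_guard)]
        right = power[i + half_guard + 1:i + half_guard + 1 + half_train]
        training = left + right
        if not training:
            continue
        clutter_level = sum(training) / len(training)
        # The threshold scales with the local clutter estimate.
        if cell > scale * clutter_level:
            detections.append(i)
    return detections


# One strong echo in flat clutter, then the same scene with 5x stronger clutter.
scene = [1.0] * 20
scene[10] = 20.0
print(ca_cfar(scene))
print(ca_cfar([5.0 * p for p in scene]))
```

Because the threshold is proportional to the estimated clutter level, scaling the whole scene leaves the detection result unchanged, which is the constant false alarm property in its simplest form.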

2  UWB  Sensor  Networks  in  Hostile  Environment 

Since 2002, commercial applications based on Ultra WideBand (UWB) have grown rapidly in popularity.
This has ignited interest in the use of this technology for sensor networks.
UWB systems have potentially low complexity and low cost, and they offer very good time-domain
resolution, which facilitates location and tracking applications. UWB wireless sensor networks are
therefore promising.

One of the most important applications of WSNs is in the battlefield, where hostile
interference exists. Frequency Hopping (FH) technology improves performance when
the communication system is attacked by hostile interference, and it reduces the ability of a hostile
observer to receive and demodulate the communication signal. This inherent property makes FH
a natural candidate for UWB sensor networks. According to the UWB definition released by the
FCC (FCC, 2002), a signal is UWB if its bandwidth exceeds 500 MHz. The overall 7.5 GHz
bandwidth, that is, the frequencies in the range 3.1 GHz to 10.6 GHz under the FCC ruling, can
therefore be split into smaller frequency bands of at least 500 MHz each. This characteristic inspired
us to design a hybrid FH/TH-PPM UWB system in [2].
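
The band-splitting arithmetic can be sketched as follows. This is only an illustration of the frequency plan implied by the FCC numbers quoted above: the equal-width split, the function names, and the pseudo-random hopping pattern are assumptions, not the design used in [2].

```python
import random


def fh_subbands(f_low_hz=3_100_000_000, f_high_hz=10_600_000_000,
                min_bw_hz=500_000_000):
    """Split [f_low_hz, f_high_hz] into the largest number of equal bands
    that are each at least min_bw_hz wide."""
    total = f_high_hz - f_low_hz
    count = total // min_bw_hz
    width = total / count
    return [(f_low_hz + k * width, f_low_hz + (k + 1) * width)
            for k in range(count)]


def hop_sequence(num_bands, num_hops, seed=0):
    """Pseudo-random band index for each hop (hypothetical hopping pattern)."""
    rng = random.Random(seed)
    return [rng.randrange(num_bands) for _ in range(num_hops)]


bands = fh_subbands()
print(len(bands), bands[0], bands[-1])
print(hop_sequence(len(bands), 8))
```

With the 7.5 GHz allocation and a 500 MHz minimum, at most 15 hopping bands are available.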

In [2], we studied the performance of an FH/TH UWB sensor network with hostile partial-band
(PB) tone interference and multi-user interference. Interference due to the hostile environment
and multi-user access is a critical factor affecting the performance of wireless sensor networks,
so there is a clear need for a system that can survive severe interference. In [2], we also presented
an analysis for precisely calculating the bit error rates in the presence of multitone/pulse
interference (tone in the frequency domain and pulse in the time domain) and multi-user interference.

3  Energy  Consumption  and  Latency  Estimation  Based  on  Statistical  Modeling  and  ML  Detection 

In [3], we modeled the end-to-end distance for a given number of hops in wireless sensor networks. We
derived that the single-hop distance $r$ has the density $2r/R^2$ for $0 \le r \le R$, where $R$ is the
transmission range. The end-to-end distance follows a beta distribution for two hops and approaches a
Gaussian distribution when the number of hops is beyond three. As an application example, we proposed
Statistical Distance Estimation, which shows less distance error than Hop-TERRAIN and APS (Ad hoc
Positioning System). Our results are also applicable to other applications of wireless sensor networks.
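
As a quick numerical check of the single-hop density $2r/R^2$, the sketch below draws samples by inverting the CDF $r^2/R^2$ and compares the sample mean with the closed-form mean $2R/3$. The function names, the seed, and the value of R are hypothetical and chosen for illustration only.

```python
import math
import random


def sample_single_hop_distance(R, rng):
    """Draw r with density 2r/R^2 on [0, R] by inverting the CDF r^2/R^2."""
    return R * math.sqrt(rng.random())


def mean_single_hop_distance(R):
    """Closed-form mean: the integral of r * 2r/R^2 over [0, R] equals 2R/3."""
    return 2.0 * R / 3.0


rng = random.Random(1)
R = 30.0  # hypothetical transmission range in meters
samples = [sample_single_hop_distance(R, rng) for _ in range(20000)]
print(sum(samples) / len(samples), mean_single_hop_distance(R))
```

This density puts most of the probability mass near the edge of the transmission range, which is why the mean hop length is two thirds of R rather than half.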

Based on this theoretical observation, we estimated energy consumption and latency from a
prediction of the number of hops [4] [5]. Potential applications of WSNs, such as
environment monitoring, often emphasize the importance of location information, and geographic
routing was proposed to handle this requirement. Most likely, a packet is not routed to a
specific node but to a given location. An interesting question then arises: “How many hops does it take
to reach a given location?” The prediction of the number of hops is important not only in itself but
also in helping to estimate the latency and energy cost, both of which are important to the viability
of WSNs.

The question would be very simple if the sensor nodes were manually placed. However, if
sensor nodes are deployed in a random fashion, which is the case for most potential applications, the
answer is beyond the reach of simple geometry. The stochastic nature of random deployment
calls for a statistical study. A natural estimate would be to divide the distance by
the average inter-node distance (i.e., the average single-hop distance). However, such an estimate
may be unable to provide the required accuracy. We propose making a Maximum Likelihood (ML)
decision,

$$\hat{H} = \arg\max_{H} f(H \mid r), \qquad H = 1, 2, 3, \ldots \tag{1}$$

Considering

$$f(H \mid r) = \frac{f(H, r)}{f(r)}, \tag{2}$$

the decision rule can be translated into

$$\hat{H} = \arg\max_{H} f(H, r), \tag{3}$$

where $f(H, r)$ is also called the objective function.
Generally,  for  H  =  n,  we  have 

Theoretically, we can take the derivative of (5) with respect to $r$ to obtain the objective function,
use (3) to decide the most likely $H$ given $r$, and give the probability of error for such a decision.
However, (5) is awkward to evaluate, and the computational cost could limit the applicability of such
a decision scheme. Therefore, we propose an Attenuated Gaussian Approximation for the joint pdf,
based on the histograms collected from simulations. Skewness and kurtosis tests show a good
fit between our approximation and the simulation data. Writing $m_n$ and $\sigma_n$ for the mean and
standard deviation of the approximation for $H = n$, and $\alpha$ for its attenuation factor, we also
found the following properties.

1. $\sigma_n \approx \sigma_{n-1}$, which means the neighboring joint pdfs have similar spread.

2. $m_n - m_{n-1} \approx m_{n+1} - m_n$, which means the joint pdfs are evenly spaced.

3. $3 < \frac{m_n - m_{n-1}}{\sigma_n} < 5$, which means the overlap between the neighboring joint pdfs is small but
not negligible. (As a rule of thumb, $Q(3)$ is considered relatively small and $Q(5)$ is regarded as
negligible.)

4. $\frac{m_n - m_{n-2}}{\sigma_n} \gg 5$, which means the overlap between the non-neighboring joint pdfs is negligible.

5. $\alpha < 1$. For large density $\lambda$, $\alpha \to 1$. Along with Property 1, this tells us that the neighboring
joint pdfs have nearly identical shape.

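These properties considerably simplify the decision rule and the error analysis:

$$\text{we decide } \hat{H} = n \text{ if } d_{n-1} < r < d_n, \tag{6}$$

where $d_n$ is the decision boundary given by

$$d_n = \frac{\sigma_n m_{n+1} + \sigma_{n+1} m_n}{\sigma_n + \sigma_{n+1}}. \tag{7}$$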

The probability of error $P(e)$ is then approximated by an infinite sum over $n$ of Gaussian tail
($Q$-function) terms. (8)
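
The threshold form of the decision rule in (6) and (7) can be sketched as follows, assuming the boundary in (7) is the sigma-weighted average of the two neighboring means. The function names and the numeric means and standard deviations are hypothetical; in practice they would come from the fitted Attenuated Gaussian Approximation.

```python
import bisect


def decision_boundaries(means, sigmas):
    """Boundary d_n between hop counts n and n+1 (assumed form of Eq. (7))."""
    return [
        (sigmas[n] * means[n + 1] + sigmas[n + 1] * means[n])
        / (sigmas[n] + sigmas[n + 1])
        for n in range(len(means) - 1)
    ]


def estimate_hops(r, boundaries, first_hop_count=1):
    """Decide H = n when d_{n-1} < r < d_n, as in Eq. (6).

    boundaries[0] separates first_hop_count from first_hop_count + 1.
    """
    return first_hop_count + bisect.bisect_left(boundaries, r)


# Hypothetical per-hop-count means and standard deviations (meters).
means = [20.0, 45.0, 70.0, 95.0]
sigmas = [6.0, 6.5, 7.0, 7.5]
bounds = decision_boundaries(means, sigmas)
print(bounds)
print([estimate_hops(r, bounds) for r in (10.0, 40.0, 60.0, 120.0)])
```

Once the boundaries are precomputed, each decision is a single binary search, which avoids evaluating the joint pdf at run time.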

Based on this result, for latency estimation, a good estimator of the total latency of an $l$-bit message
is

$$l\,[T_{tx} + (n - 1)(T_{tx} + T_{rx}) + T_{rx}] = l\,n\,(T_{tx} + T_{rx}). \tag{9}$$

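And for the energy cost, we have

$$E_{total}(l, r) = 2 n l E_{elec} + l\,\epsilon_{fs} \sum_{i=1}^{n} r_i^2, \tag{10}$$

where $E_{elec}$ is the unit energy consumed by the electronics to process one bit of the message, $\epsilon_{fs}$ is the
amplifier factor for free-space path loss, and $r_i$ is the distance covered by the $i$-th hop.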
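
A minimal sketch of how the hop-count estimate feeds the latency and energy estimates in (9) and (10) is given below. The energy function assumes the common first-order radio model (electronics energy per bit at both transmitter and receiver, plus a free-space amplifier term proportional to the squared hop distance). The function names and all numeric values are illustrative assumptions, not measurements from [4] [5].

```python
def total_latency(num_bits, num_hops, t_tx, t_rx):
    """Eq. (9): each of the n hops costs one transmit and one receive time per bit."""
    return num_bits * num_hops * (t_tx + t_rx)


def total_energy(num_bits, hop_distances, e_elec, eps_fs):
    """Assumed form of Eq. (10) under the first-order radio model."""
    n = len(hop_distances)
    # Electronics energy: one transmitter and one receiver per hop.
    electronics = 2 * n * num_bits * e_elec
    # Amplifier energy: free-space loss grows with the square of each hop length.
    amplifier = num_bits * eps_fs * sum(d ** 2 for d in hop_distances)
    return electronics + amplifier


# Hypothetical numbers: 1000-bit message, 3 hops of 20 m each.
print(total_latency(1000, 3, t_tx=1e-6, t_rx=1e-6))
print(total_energy(1000, [20.0, 20.0, 20.0], e_elec=50e-9, eps_fs=10e-12))
```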

4  Self-Organization  for  Underwater  Acoustic  Sensor  Networks 

In [6], we are concerned with the optimal cluster size in underwater acoustic sensor networks. An
UnderWater Acoustic Sensor Network (UW-ASN) can be thought of as an ad hoc network consisting
of sensors linked by an acoustic medium to perform distributed sensing tasks. To achieve this
objective, sensors must self-organize into an autonomous network that can adapt to the
characteristics of the underwater environment. UW-ASNs share many communication technologies with
traditional ad hoc networks and terrestrial wireless sensor networks, but there are some vital
differences, such as limited energy and bandwidth constraints. Thus the protocols developed for traditional
wireless ad hoc networks are not necessarily well suited to the unique features of UW-ASNs. When a
wireless sensor may have to operate for a relatively long duration on a tiny battery, energy efficiency
becomes a major concern. Another issue is that, because bandwidth is limited in shallow water
communications, multi-hop communication could introduce heavy
interference between cluster members. Therefore, each sensor in a cluster communicates directly with
its cluster head, and intra-cluster communication should be coordinated by the cluster head in
order to maximize bandwidth usage.

We showed that the optimal cluster size also depends on
the working frequency of the acoustic transmission. Furthermore, we showed that assigning working
frequencies to cluster members according to their distances to the cluster head could minimize
the energy consumption. Clustering has been widely used in pattern recognition, and we use it
to obtain an energy-efficient organization for UW-ASNs. Consider a heterogeneous UW-ASN in
which the low-capacity sensors serve as cluster members and are randomly distributed, and the
high-capacity sensors serve as cluster heads and are manually positioned. If we obtain the optimal
cluster size, then the required number of high-capacity sensors and their ideal positions can also
be determined. If we assume the high-capacity sensors have a virtually unlimited energy reserve
compared to the low-capacity ones, only the energy consumption of the low-capacity sensors needs
to be counted. Under these circumstances, we derived the optimum cluster size, and we observed
that the frequency allocation could be designed to minimize the energy consumption. Although
the optimum frequency allocation is still elusive, we proposed an objective function that can be
used to seek such a frequency allocation algorithm.

5  Cross-Layer  Design  in  Mobile  Ad  Hoc  Networks 

The demand for Quality of Service (QoS) in mobile ad hoc networks is growing rapidly.
To enhance QoS, in [7] [8] we considered the physical layer and the data-link
layer together, a cross-layer approach. We proposed to use a Fuzzy Logic System (FLS) for packet
transmission delay analysis and prediction, and we applied both a singleton type-1 FLS and an interval
type-2 FLS. Theoretical analysis and simulation data demonstrate
that type-2 fuzzy membership functions (MFs), i.e., Gaussian MFs with uncertain variance, are
most appropriate for modeling the Bit Error Rate (BER). Recent research and simulation data showed
that the lognormal distribution fits the MAC layer service time, so Gaussian MFs can also be used
to model the logarithm of the MAC layer service time. We used Gaussian MFs
to represent the antecedents and the consequent, and two FLSs, a singleton type-1
FLS and an interval type-2 FLS, were designed to predict the packet transmission delay based on the
BER and the MAC layer service time. The transmission power can then be adjusted according
to the predicted packet transmission delay, which in turn changes the average delay, energy
consumption, and throughput.

We implemented the simulation model using the OPNET
modeler. For the type-1 FLS, we chose Gaussian MFs for the antecedents; for the interval
type-2 FLS, we used Gaussian primary MFs with fixed mean and uncertain standard deviation (STD) for
the antecedents. The steepest descent algorithm was used to train all the parameters based on 300 data
sets. After training, the rules were fixed, and we tested the FLSs on the remaining 300 data sets. We
summarized the root-mean-square errors (RMSE) between the estimated packet transmission delay
and the actual delay. Simulation results showed that the interval type-2 FLS performs better than
the type-1 FLS.

We then used the outputs of the FLS predictors to control the transmission power.
As an ideal reference algorithm, we assumed that the actual transmission delay was known before the
transmission power was adjusted, so the simulation results could be used to compare the performance of three
algorithms (type-2 FLS, type-1 FLS, and the ideal case). For average delay, the type-2 FLS is
better than the type-1 FLS, and the ideal case is the best among the three. For energy efficiency, the
type-2 FLS is better than the type-1 FLS, and the ideal case is the lower bound. For throughput,
the type-2 FLS is better than the type-1 FLS, and the ideal case is the upper bound.
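
The antecedent membership functions and the RMSE figure of merit described above can be sketched as follows. For a Gaussian primary MF with fixed mean and a standard deviation known only to lie in an interval, the lower and upper membership grades are obtained with the smallest and largest standard deviation, respectively. The function names and numeric values are hypothetical, and the sketch covers only the membership grades and the error metric, not the full FLS inference or the training procedure of [7] [8].

```python
import math


def gaussian_mf(x, mean, std):
    """Type-1 Gaussian membership grade."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2)


def interval_type2_mf(x, mean, std_low, std_high):
    """(lower, upper) membership grades for a Gaussian primary MF with
    fixed mean and uncertain standard deviation in [std_low, std_high]."""
    return gaussian_mf(x, mean, std_low), gaussian_mf(x, mean, std_high)


def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual delays."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )


# Hypothetical antecedent for log-BER centered at -4 with STD between 0.5 and 1.0.
print(interval_type2_mf(-3.0, -4.0, 0.5, 1.0))
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

The gap between the lower and upper grades is the footprint of uncertainty that a type-1 FLS cannot represent.
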

6  Energy  Efficient  Query  Processing  in  Data-Centric  Wireless  Sensor  Networks 

The widespread deployment of sensor nodes is transforming the physical world into a computing
platform. Sensor nodes not only respond to physical signals to produce data, they also embed
computing and communication capabilities. They are thus able to store, locally process, and transfer
the data they produce. From a data storage point of view, a wireless sensor network (WSN) can be
regarded as a kind of database, a distributed sensor database system (DSDS). Compared
to traditional database systems, a DSDS stores data within the network and allows queries to be injected
anywhere through query processing operators in the network. Even though data query processing
methods have been studied extensively in traditional database systems, few of them can be directly
applied to sensor database systems because of the characteristics of sensor networks: their decentralized
nature, limited computational power, imperfect recorded information, and the energy
scarcity of individual sensor nodes.

The goal of monitoring through sensor nodes is to infer information about objects from
measurements made at remote locations. Since inference processes are always less than perfect,
there is an element of uncertainty regarding the answers. Viewed from this perspective, the
problem of uncertainty, which reflects the quality of query answers, is central to monitoring
applications. Thus, to build useful information systems, it is necessary to learn how to represent
and reason with imperfect information. Considering the quality requirement and the power constraint,
in [9] we analyzed and classified the sources of imperfect information and of energy waste for an
environmental temperature monitoring application.

To analyze and understand query answer uncertainty, we used the image chain method to express the nature
and the source of uncertainty in temperature information derived from remote sensing. There
are three main sources of imperfect information: the measurement quality of nodes, which introduces
uncertainty and imprecise information into query answers; the point spread function (PSF) of nodes, which
introduces ambiguity into query answers; and link quality, which introduces incompleteness into
query answers. Fixing other conditions, such as node density, communication range, sensing range,
and network coverage, we varied these imperfect information sources separately to check their
influence on the correctness of query answers. Simulation
results showed that as measurement errors, misrepresentation errors, or missing information increase,
the errors in query answers clearly increase, and the confidence of query
answers is therefore reduced.

As for the sources of energy waste, we considered that, within a network, not all available
nodes provide useful information that improves the accuracy of the final results. Furthermore, some
information might be redundant because nodes close to each other would have similar data. From
this perspective, collecting raw readings from all nodes at front-end nodes involves large amounts of
raw readings, which leads to a shorter lifetime, especially for energy-limited WSNs.

In addition, we proposed a quality-guaranteed and energy-efficient (QGEE) algorithm for
sensor database systems in [9]. We employed an in-network query processing method to task sensor
networks through declarative queries. For query answer confidence control, we modeled the problem
of determining optimal locations for a query as a k-partial set cover problem, and, when considering
the influence of the PSF of nodes on the uncertainty of query answers, we adaptively determined
the radius of the disks according to users’ quality requirements instead of fixing it. We formed a query
vector space model (QVSM) to express the correlation between a query and all candidate nodes,
and we chose the location, measurement quality, and remaining battery capacity of nodes as the elements
of the QVSM. The decision as to which nodes are active to respond to queries is based on their query
correlation. That is, nodes with the highest query correlation among their one-hop neighbors are chosen
to participate in the related query processing. In QGEE, active nodes are chosen locally by leveraging
cooperation among nodes. In addition, we controlled the sample size to ensure that the
sampling distribution of the estimators meets users’ pre-specified target precision, and we used a multipath,
power-aware, and mobility-aware routing scheme to control information collection.

Based on these
strategies, QGEE can adaptively form an optimal query plan in terms of energy efficiency and
query quality. That is, only a subset of nodes within a network is chosen to acquire readings
or samples corresponding to the fields or attributes referenced in queries. The goal of our approach
is to reduce the interference coming from measurements with extreme errors and to minimize energy
consumption by providing service that is necessary and sufficient for the needs of
applications. Moreover, we employed a probabilistic method to formulate the distribution of the imperfect
information sources in terms of probability distribution functions (PDFs). Since a statistic
measured on samples can rarely, if ever, be expected to be exactly equal to a parameter, it is important
that a statement describing the precision accompanies the estimate. We used confidence intervals
to state both how close the value of a statistic is likely to be to the value of a parameter and the
chance of its being that close. Hence, using our QGEE scheme, probabilistic query answers can be acquired
on uncertain data. The probabilities attached to an answer allow users to place appropriate confidence
in it.

The simulation results demonstrated that, compared with a query processing algorithm
that has no query optimization, our algorithm can reduce resource usage by about 50% and the
frame loss rate by about 20% when processing the same number of queries. The simulation results for
the MAXIMUM, MINIMUM, and AVERAGE aggregation operations showed that QGEE can successfully obtain
suitable confidence intervals that contain the true value of a query answer
with a probability equal to or larger than the probability pre-specified by users
according to their query answer confidence requirements.
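
The role of confidence intervals and sample-size control for an AVERAGE query can be illustrated with the standard normal-approximation formulas. This is a generic textbook sketch, not the QGEE estimator from [9]; the function names and the temperature readings are hypothetical, and the normal approximation is itself an assumption that is only rough for small samples.

```python
import math
from statistics import NormalDist, mean, stdev


def average_with_confidence(readings, confidence=0.95):
    """Sample mean and a two-sided normal-approximation confidence interval."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    m = mean(readings)
    half_width = z * stdev(readings) / math.sqrt(len(readings))
    return m, (m - half_width, m + half_width)


def required_sample_size(std, half_width, confidence=0.95):
    """Smallest number of readings whose interval half-width meets the target."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * std / half_width) ** 2)


# Hypothetical temperature readings (degrees Celsius) from the active nodes.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2]
print(average_with_confidence(readings))
print(required_sample_size(std=1.0, half_width=0.5))
```

Tightening the target precision or raising the confidence level increases the number of nodes that must respond, which is the trade-off between query quality and energy use discussed above.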

7  Energy  Efficient  Asynchronous  MAC  Protocol  for  UWB  Systems 

Ultra wideband (UWB) technology offers unique advantages for wireless communications: precise
location-timing capabilities, low power, low complexity, and low cost. However, no existing wireless
network successfully takes advantage of the properties of this technology because of the lack of an
efficient medium access control (MAC) technology. Multi-antenna systems have been studied
intensively in recent years due to their potential to dramatically increase the channel capacity in fading
channels. It has been shown that multi-input multi-output (MIMO) systems can support higher
data rates under the same transmit power budget and bit-error-rate performance requirements as a
single-input single-output (SISO) system. However, direct application of multi-antenna techniques
to sensor nodes is impractical because of the limited physical size of a sensor node, which typically can
support only a single antenna. In recent years, the virtual MIMO concept has been proposed, which
allows individual single-antenna nodes to cooperate on information transmission and/or reception.
A cooperative MIMO system can be constructed such that energy-efficient MIMO schemes can be
deployed.

In [10], we proposed an energy-efficient MAC protocol: an asynchronous MAC protocol for UWB
communications (A-MAC-UWB). From the energy efficiency aspect, circuit energy consumption in UWB
communication is comparable to or even dominates the transmission energy, because UWB
has low power consumption, i.e., a bit rate of 100 Kbps over 5 meters with no more than 1 mW of power
consumption. We therefore use virtual MIMO technology to reduce the idle time spent waiting to send
or receive the next symbol. The virtual MIMO strategy can also increase the data rate and substitute
space diversity for time diversity to improve system performance. Besides this, we also exploit
multiple working modes: sleep and active modes. The idle time spent waiting to transmit or receive
the next data packet is reduced by forcing nodes into sleep mode to achieve energy conservation,
at the cost of increased data packet latency.

In addition, since UWB can support multiple access, our A-MAC-UWB protocol does not use
mutual exclusion (as is commonly done by random access or TDMA protocols) but, in contrast,
allows interference to occur and adapts to it. That is, competing sources are allowed to send
concurrently, causing rate reduction instead of collisions. A slotted ALOHA scheme is used in
A-MAC-UWB. One advantage of our algorithm is that it removes the overhead of the control packets used
with carrier sensing to avoid collisions, such as RTS/CTS in the CSMA/CA scheme, while still ensuring
successful transmission. For multiuser interference, we set up a model that adaptively adjusts the data
rate to ensure a certain SNR at the receiver side, since the Shannon capacity of a multipath fading additive
white Gaussian noise (AWGN) wideband channel is a linear function of SNR. We formulated the
relationship between the probability of bit error and the signal to noise and multiuser interference ratio
(SNIR).

For the optimum design of the power on/off phase durations, we considered traffic whose
inter-arrival time follows a heavy-tailed distribution rather than the Poisson model. Based on that,
we obtained the probability density function (pdf) of the power-off phase duration for our algorithm.
We also set up an objective function that both extends the power-off duration as long as possible
and keeps the chance of buffer overflow for nodes within a cluster as small as possible.
For the power-on duration design, we set up a second objective function to choose the highest data
rate for which the BER at the receiver side still satisfies the requirement. Compared
with our previous work, we tried to find a better method to trade off data packet latency
against energy conservation.
Abstract

In this paper, we consider the decision fusion of Rayleigh fluctuating targets in multi-radar
sensor networks. Decision fusion and data fusion in Wireless Sensor Networks (WSNs) have
been widely studied in order to save energy. A radar system, as a special sensor network,
faces bandwidth constraints rather than energy restrictions in real-time applications when
deployed for battlefield surveillance. Reliable detection of multiple targets in clutter is perhaps
the most important objective in such an echo-location system. In this work, we study the decision
fusion rules for multiple fluctuating targets in multi-radar (MT-MR) sensor networks. The MT-MR
decision fusion problem is modeled as a multi-input multi-output (MIMO) system. We
assume that each radar makes a binary decision for each target from its observation, i.e., whether
the target is present or not. We derive our MIMO fusion rules based on the target fluctuation
model and compare them against the optimal likelihood ratio (LR) method, the maximum ratio
combiner (MRC) and the equal gain combiner (EGC). Simulation results show that the MIMO fusion
rules approach the optimal LR rule and outperform MRC and EGC at high signal to clutter ratio (SCR).

Index Terms: wireless sensor networks, radar, target fluctuation, clutter, MIMO,
data fusion, Rayleigh, optimal likelihood, maximum ratio combiner, equal gain combiner
1 Introduction


Wireless sensor networks (WSNs) have attracted growing interest in various applications, especially
in the areas of battlefield surveillance, health care and telemedicine, and environmental and habitat
monitoring. Radar, as a powerful sensor system, has been employed for the detection and location
of reflecting objects such as aircraft, ships, vehicles, people and the natural environment. By
radiating energy into space and detecting the echo signal reflected from an object or target, the
radar system can determine the presence of a target. Furthermore, by comparing the received echo
signal with the transmitted signal, the location of a target can be determined along with other
target-related information [1].

A conventional radar system operates as a purely independent entity. In a resource-constrained
WSN, however, such detached operation may lead to deteriorated performance and waste of limited
resources. Collaborative signal and information processing over the network is a very promising area
of research and is related to distributed information fusion [2]. Important technical issues include
the degree of information sharing between sensors and how sensors fuse the information from other
sensors. Processing data from more sensors generally results in better performance but also requires
more communication resources. Similarly, less information is lost when communicating information at
a low level (e.g., raw data), but doing so requires more bandwidth. Therefore, there is a tradeoff
between system performance and resource utilization in collaborative information processing and
data fusion.

Much of the prior research in data fusion is based on the assumption of lossless communication, i.e.,
the information sent from local sensors is perfectly recovered at the fusion center. For example, in
[3] and [4], Varshney et al. investigated the optimum fusion rules under the conditional independence
assumption. Other papers [5, 6] addressed the problem of distributed detection with constrained
system resources, most of which provided solutions to optimize sensor selection. However,
this lossless communication assumption is not practical for many WSNs where the transmitted
data suffers from channel fading and multi-user interference. On the other hand, decision fusion with
non-ideal communication channels has been studied at both the fusion center level [7, 8] and the
sensor level [9, 10]. In [8], Thomopoulos and Zhang derived the optimal thresholds by assuming a
simple binary symmetric channel between the sensors and the fusion center. Their method is quite
simple but requires global knowledge of the entire system. In [7], channel-aware decision fusion rules
were developed using a canonical distributed detection system where binary decisions from multiple
parallel sensors are transmitted through fading channels to a fusion center. Later, Lin et al. [11]
extended the channel-aware decision fusion rules to more realistic WSN models that involve
multi-hop transmissions. The above results, however, were mostly obtained for the detection of one
target or one event, and are not applicable to multi-target situations. Furthermore, in a radar
sensor system, when clutter (the unwanted echoes from the natural environment) is much larger
than the receiver noise, detection can be quite different from that when the noise is dominant.

The objective of this work is to derive the decision fusion rules for multiple fluctuating targets
in multi-radar (MT-MR) sensor networks. We focus on the detection performance of the
fused data in the presence of clutter. The MT-MR decision fusion is modeled as a multi-input
multi-output (MIMO) system. We present the theoretical formulation of the MIMO decision fusion
problem. We assume that the multiple targets are stationary targets embedded in
clutter. The Rayleigh target fluctuation model and Gaussian clutter are used in our first-stage study.
In particular, we assume that each radar in our scenario uses a constant false alarm rate (CFAR)
receiver. CFAR automatically raises the threshold level to keep clutter echoes and external
noise from overloading the automatic tracker, which provides good clutter rejection.

The remainder of this paper is organized as follows. In the next section, we introduce the
concepts of clutter and the target fluctuation model in a radar sensor system. In Section 3, we briefly
review previous work on fusion rules designed for a canonical parallel distributed detection
system with single-hop transmission between the sensor nodes and the fusion center. In Section 4, we
present our MIMO decision fusion model for multi-target multi-radar sensor networks. Simulation
and performance analysis are presented in Section 5. Section 6 concludes this paper.
2 Target Detection in Radar Sensor System




Figure  1:  Basic  Principle  of  Radar  System 

The basic principle of radar [1] is illustrated in Fig. 1. An electromagnetic signal is generated
by the transmitter and radiated into space by the antenna. A portion of the transmitted energy
is intercepted by the target and reradiated in various directions. The reradiation directed back
towards the radar is collected by the radar antenna, which delivers it to a receiver. There it is
processed to detect the presence of the target and determine its location. A single antenna is
usually used on a time-shared basis for both transmitting and receiving when the radar waveform
is a repetitive series of pulses. The range, or distance, to a target is found by measuring the time it
takes for the radar signal to travel to the target and return to the radar. The target's location
in angle can be found from the direction in which the narrow-beamwidth radar antenna points when the
received echo signal is of maximum amplitude. If the target is in motion, there is a shift in the
frequency of the echo signal due to the Doppler effect. This frequency shift is proportional to the
velocity of the target relative to the radar. The Doppler frequency shift is widely used in radar as
the basis for separating desired moving targets from fixed clutter echoes reflected from the natural
environment such as land, sea or rain. Radar can also provide information about the nature of the
target being observed.
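
The range and Doppler relations described above can be illustrated with a short sketch. It assumes a monostatic radar and the standard textbook relations R = cT/2 and f_d = 2 v_r / lambda; the function names and the example values are illustrative and do not come from the report.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def target_range(round_trip_delay_s: float) -> float:
    """Range in meters from the two-way echo delay: R = c * T / 2."""
    return C * round_trip_delay_s / 2.0


def doppler_shift(radial_velocity_mps: float, carrier_hz: float) -> float:
    """Doppler shift in Hz for a monostatic radar: f_d = 2 * v_r / wavelength."""
    wavelength = C / carrier_hz
    return 2.0 * radial_velocity_mps / wavelength


if __name__ == "__main__":
    # Hypothetical example: a 100 microsecond echo delay and a target closing
    # at 30 m/s seen by a 10 GHz radar.
    print(f"range   = {target_range(100e-6):.1f} m")
    print(f"doppler = {doppler_shift(30.0, 10e9):.1f} Hz")
```

A stationary target gives a zero Doppler shift, which is why fixed clutter can be separated from moving targets in the frequency domain.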

In active radar sensor networks, the received data usually consists of three parts: white thermal
noise, clutter scattered by the land environment, and, if a target is present, a reflected or
reradiated version of the transmitted signal [12]. That is, we have

    $y^{(t)} = \alpha^{(t)} s^{(t)} + u^{(t)} + n^{(t)}$    (1)

in which $s^{(t)}$ and $y^{(t)}$ are the transmitted and received signals, respectively, and
$\alpha^{(t)}$ is the target cross section or radar cross section (RCS). It is assumed that $n^{(t)}$
is additive noise and $u^{(t)}$ is the returned clutter, a distorted version of the transmitted
signal $s^{(t)}$. In the work presented here, it is assumed that the received clutter is much larger
than the white thermal noise, i.e., $u^{(t)} \gg n^{(t)}$. Thus (1) becomes

    $y^{(t)} \approx \alpha^{(t)} s^{(t)} + u^{(t)}$  (when $u^{(t)} \gg n^{(t)}$)    (2)

The classical radar equation uses the target cross section, or radar cross section (RCS), to
determine the power density returned to the radar for a particular power density incident on the
target. Nevertheless, the scattering of electromagnetic energy from a target is a rather complicated
phenomenon, which depends on a number of factors such as target geometry, size, shape, aspect, and
altitude with respect to the radar antenna. Therefore, it has been advantageous to model the target
RCS as a random variable. Several common fluctuation models are available in the open literature,
e.g., Swerling (chi-square), lognormal, Rayleigh, Weibull as a compound Rayleigh distribution, and
Shadowed Rice targets. In this work, we model the target fluctuation with a Rayleigh distribution,
which has the probability density function (pdf)

    $f(v) = \frac{v}{\sigma_\alpha^2} \exp\left(-\frac{v^2}{2\sigma_\alpha^2}\right), \quad v \ge 0$    (3)

where $2\sigma_\alpha^2$ is the mean square value of the envelope $v$.
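
As a numerical check of the Rayleigh model in (3), the following sketch draws envelopes as the magnitude of two independent zero-mean Gaussian components (a standard construction, assumed here rather than taken from the report) and confirms that the mean square value is close to twice the per-component variance. The function names and the sample size are illustrative.

```python
import math
import random


def rayleigh_pdf(v: float, sigma: float) -> float:
    """Rayleigh pdf of eq. (3); zero for negative envelopes."""
    if v < 0:
        return 0.0
    return (v / sigma ** 2) * math.exp(-v ** 2 / (2 * sigma ** 2))


def sample_envelope(sigma: float, rng: random.Random) -> float:
    """Magnitude of two independent N(0, sigma^2) components (Rayleigh distributed)."""
    return math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def empirical_mean_square(sigma: float, n: int, seed: int = 0) -> float:
    """Monte Carlo estimate of E[v^2]; theory gives 2 * sigma^2."""
    rng = random.Random(seed)
    return sum(sample_envelope(sigma, rng) ** 2 for _ in range(n)) / n


def pdf_area(sigma: float, steps: int = 20000) -> float:
    """Trapezoidal integral of the pdf over [0, 10 * sigma]; should be close to 1."""
    upper = 10.0 * sigma
    dv = upper / steps
    total = 0.5 * (rayleigh_pdf(0.0, sigma) + rayleigh_pdf(upper, sigma))
    total += sum(rayleigh_pdf(k * dv, sigma) for k in range(1, steps))
    return total * dv


if __name__ == "__main__":
    sigma = math.sqrt(0.5)  # gives unit mean square value, 2 * sigma^2 = 1
    print(empirical_mean_square(sigma, 20000), pdf_area(sigma))
```

Choosing sigma = sqrt(0.5) corresponds to a unit-power RCS, which is the normalization used later in the simulations.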

Clutter is the unwanted echoes from the natural environment such as land, sea, rain, birds, and
insects. Clutter can be distributed in spatial extent in that it is much larger in physical size
than the radar resolution cell. There are also point or discrete clutter echoes that produce large
backscatter. Because of the highly variable nature of clutter echoes, clutter is often described by a
probability density function. Some clutter has a distribution similar to those of the target
fluctuation models, e.g., Gaussian, Rayleigh, log-normal and Weibull. Other distributions have also
been proposed to describe the special statistics of clutter, including the K-distribution,
contaminated normal, gamma and log-Weibull. For the first stage of this work, we assume that the
returned clutter follows a Gaussian distribution with zero mean.

3  Review  of  Previous  Decision  Fusion  Rules 

In a single-target, single-hop sensor network, the typical parallel fusion structure in a flat fading
channel is depicted in Fig. 2. The received signal at the fusion center from the $k$th sensor is
$y_k = h_k u_k + n_k$, where $h_k$ is the channel fading envelope and $n_k$ is zero-mean additive
Gaussian noise with variance $\sigma^2$. $K$ sensors collect data generated according to either
$H_0$ (no target is present) or $H_1$ (a target is present), make local decisions, and transmit
these decisions over fading and noisy channels to a fusion center. The fusion center tries to decide
which hypothesis is true based on the received data $y_k$ for all $k$.


Figure  2:  Single-target,  single-hop  decision  fusion  model 

Assume that the $k$th local sensor makes a binary decision $u_k \in \{+1, -1\}$, with false alarm
and detection probabilities $P_{fk}$ and $P_{dk}$, respectively. That is, we have
$P_{fk} = P[u_k = 1 | H_0]$ and $P_{dk} = P[u_k = 1 | H_1]$. Several decision fusion rules have been
developed based on the above model in [11]. Throughout this work, we use $\Lambda^{(s)}$ to denote
the fusion statistics for the single-hop, single-target transmission model.


• Optimal LR-based fusion statistic using complete prior knowledge. Assuming complete channel
knowledge, the optimal LR-based fusion statistic was derived as

    $\Lambda_1^{(s)} = \prod_{k=1}^{K} \frac{P_{dk}\,\varphi_k^{(+)} + (1 - P_{dk})\,\varphi_k^{(-)}}{P_{fk}\,\varphi_k^{(+)} + (1 - P_{fk})\,\varphi_k^{(-)}}$    (4)

where $Y = [y_1, \ldots, y_K]^T$ is a vector containing observations received from all $K$ sensors,
$\varphi_k^{(+)} = e^{-(y_k - h_k)^2 / 2\sigma^2}$ and $\varphi_k^{(-)} = e^{-(y_k + h_k)^2 / 2\sigma^2}$.

• LR-based fusion rule using only fading statistics for the Rayleigh fading channel. Implementing
the optimal LR test as in (4) requires all a priori information, including the instantaneous
channel gains. Under the Rayleigh fading model, the LR-based fusion statistic using only the
fading parameter is summarized below

    $\Lambda_2^{(s)} = \prod_{k=1}^{K} \frac{P_{dk}\,\psi_k^{(+)} + (1 - P_{dk})\,\psi_k^{(-)}}{P_{fk}\,\psi_k^{(+)} + (1 - P_{fk})\,\psi_k^{(-)}}$    (5)

where $\psi_k^{(+)} = 1 + \sqrt{2\pi}\,\gamma y_k\, e^{\gamma^2 y_k^2 / 2}\, Q(-\gamma y_k)$,
$\psi_k^{(-)} = 1 - \sqrt{2\pi}\,\gamma y_k\, e^{\gamma^2 y_k^2 / 2}\, Q(\gamma y_k)$ and
$\gamma = \sigma_h / (\sigma \sqrt{\sigma_h^2 + \sigma^2})$, with $2\sigma_h^2$ being the mean square
value of the fading channel, $\sigma^2$ the noise variance, and $Q(\cdot)$ the complementary
distribution function of a standard Gaussian random variable.


• A two-stage approximation using the Chair-Varshney fusion rule. A direct alternative to the
above LR-based fusion rules is to consider the information transmission and decision fusion
as a two-stage process: first, $y_k$ is used to infer $u_k$; then, the estimates of $u_k$ are
employed in the optimum fusion rule. Given the model in Fig. 2, the maximum likelihood
(ML) estimate of $u_k$ is $\hat{u}_k = \mathrm{sign}(y_k)$. Applying the fusion rule derived in [11],
the Chair-Varshney fusion rule is obtained as

    $\Lambda_3^{(s)} = \sum_{y_k < 0} \log\frac{1 - P_{dk}}{1 - P_{fk}} + \sum_{y_k > 0} \log\frac{P_{dk}}{P_{fk}}$    (6)


• Fusion statistic using a maximum ratio combiner (MRC). In the low SNR regime, if the local
sensors are identical, i.e., $P_{dk}$ and $P_{fk}$ are the same for all $k$, then $\Lambda_1^{(s)}$
reduces to a form analogous to an MRC

    $\Lambda_4^{(s)} = \frac{1}{K} \sum_{k=1}^{K} h_k y_k$    (7)


• Fusion statistic using an equal gain combiner (EGC). In the low SNR regime, if the local sensors
are identical, i.e., $P_{dk}$ and $P_{fk}$ are the same for all $k$, then $\Lambda_2^{(s)}$ reduces
to a form analogous to an EGC

    $\Lambda_5^{(s)} = \frac{1}{K} \sum_{k=1}^{K} y_k$    (8)

Among the above five fusion rules, $\Lambda_1^{(s)}$ requires complete channel knowledge and provides
uniformly the most powerful detection performance. At low SNR, the MRC statistic provides the
best performance among the three suboptimum fusion rules, while at high SNR, the Chair-Varshney
fusion rule outperforms the MRC and the EGC statistics. The EGC statistic, however, provides
better performance over a wide range of SNR than the MRC statistic and the Chair-Varshney
fusion rule, and it requires the least amount of prior information.
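
The five single-hop statistics reviewed above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes identical sensors (one common P_d and P_f), uses only the Python standard library, and all function names are hypothetical.

```python
import math


def q_func(x: float) -> float:
    """Gaussian tail probability Q(x) = P[N(0, 1) > x]."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def lr_optimal(y, h, pd, pf, sigma):
    """LR statistic with known channel gains h_k, following the form of eq. (4)."""
    stat = 1.0
    for yk, hk in zip(y, h):
        plus = math.exp(-(yk - hk) ** 2 / (2 * sigma ** 2))
        minus = math.exp(-(yk + hk) ** 2 / (2 * sigma ** 2))
        stat *= (pd * plus + (1 - pd) * minus) / (pf * plus + (1 - pf) * minus)
    return stat


def lr_fading_stats(y, pd, pf, sigma, sigma_h):
    """LR statistic using only the Rayleigh fading parameter, following eq. (5)."""
    gamma = sigma_h / (sigma * math.sqrt(sigma_h ** 2 + sigma ** 2))
    stat = 1.0
    for yk in y:
        x = gamma * yk
        g = math.sqrt(2 * math.pi) * x * math.exp(x * x / 2)
        plus = 1 + g * q_func(-x)
        minus = 1 - g * q_func(x)
        stat *= (pd * plus + (1 - pd) * minus) / (pf * plus + (1 - pf) * minus)
    return stat


def chair_varshney(y, pd, pf):
    """Two-stage rule: hard-decide sign(y_k), then sum the log-likelihood weights.
    A sample exactly equal to zero is treated as a negative decision here."""
    return sum(
        math.log(pd / pf) if yk > 0 else math.log((1 - pd) / (1 - pf)) for yk in y
    )


def mrc(y, h):
    """Maximum ratio combiner: average of the channel-weighted observations."""
    return sum(hk * yk for yk, hk in zip(y, h)) / len(y)


def egc(y):
    """Equal gain combiner: plain average of the observations."""
    return sum(y) / len(y)


if __name__ == "__main__":
    y, h = [1.2, -0.4, 0.9], [1.0, 0.5, 0.8]
    print(lr_optimal(y, h, 0.9, 0.1, 0.5), lr_fading_stats(y, 0.9, 0.1, 0.5, 0.7))
    print(chair_varshney(y, 0.9, 0.1), mrc(y, h), egc(y))
```

Each statistic is compared against a threshold to decide between the two hypotheses; the sketch stops at the statistic because the threshold depends on the target false alarm rate at the fusion center.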
4 MIMO decision fusion model for multi-target multi-radar sensor networks

In our scenario, it is assumed that there are multiple radar sensors and multiple stationary targets
in the field. A radar detects the presence of a target and generates decision data according to
two hypotheses: $H_0$, no target is present, and $H_1$, a target is present. Each decision
is transmitted to the fusion center, which is normally a radar sensor as well. In a multi-hop radar
sensor network, the decision data is relayed via several radars to reach the fusion center. When
there are multiple radar sensors and multiple targets in the field, the data fusion problem can be
roughly modeled as a Multi-Input Multi-Output (MIMO) fusion problem. In this paper, we assume the
radar sensors are disparate and geographically dispersed in the field, such that the radar
observations or decisions are spatially independent. Fig. 3 illustrates an example of a single-hop
decision fusion problem.


Figure  3:  MIMO  fusion  model 

Let $M$ denote the number of radar sensors and $N$ be the number of targets. The received signal
$Y^{(t)}$ at the fusion center at time $t$ is an $N \times M$ matrix.
We assume that the radar sensors are geographically dispersed and that detection decisions are made
at each separate local radar. The element $y_{ij}^{(t)}$ of (9) is the decision (target present or
absent) on the $j$th target from the $i$th radar sensor. $y_{ij}^{(t)}$ can be represented as

    $y_{ij}^{(t)} = \alpha_{ij}^{(t)} s_{ij}^{(t)} + u_{ij}^{(t)}$    (10)

Observe that in [11], the researchers assume that both the false alarm probability $P_f$ and the
probability of detection $P_d$ are fixed and identical for all local sensors. Moreover, no
correlation is assumed between $P_f$ and $P_d$. In a radar system, however, this assumption is
impractical, especially in heavy clutter. One method to suppress heavy clutter
is to use a constant false alarm rate (CFAR) receiver. CFAR automatically raises the threshold level
to keep clutter echoes and external noise from overloading the automatic tracker with extraneous
information. In our study, we assume that the receivers of all radar sensors are CFAR, which implies
that although the false alarm rate is constant, the probability of detection of each local radar
sensor varies. We use $P_f$ for the fixed false alarm rate and $P_{di}$ to denote the distinct
probability of detection at radar sensor $i$ throughout this work.

We  next  derive  the  MIMO  decision  fusion  rules  for  the  multi-target  radar  sensor  networks 
starting  from  the  single-hop  radar  sensor  networks. 


4.1 Decision fusion rule in multi-target, single-hop radar sensor networks

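• Assuming complete knowledge of the target fluctuation coefficients, the optimal LR-based
fusion rule for the $j$th target was derived as

    $\Lambda_j^{(1)} = \prod_{i=1}^{M} \frac{P_{di}\,\xi_{ij}^{(+)} + (1 - P_{di})\,\xi_{ij}^{(-)}}{P_f\,\xi_{ij}^{(+)} + (1 - P_f)\,\xi_{ij}^{(-)}}, \quad j = 1, \ldots, N$    (11)

The complete decision vector for the $N$ targets is denoted as
$\Lambda^{(1)} = [\Lambda_1, \Lambda_2, \ldots, \Lambda_N]^T$, and $\xi_{ij}^{(\pm)}$ is the $j$th
element of

    $\xi_i^{(+)} = [\,e^{-(y_{i1} - \alpha_{i1})^2 / 2\sigma^2},\ e^{-(y_{i2} - \alpha_{i2})^2 / 2\sigma^2},\ \ldots,\ e^{-(y_{iN} - \alpha_{iN})^2 / 2\sigma^2}\,]^T$,
    $\xi_i^{(-)} = [\,e^{-(y_{i1} + \alpha_{i1})^2 / 2\sigma^2},\ e^{-(y_{i2} + \alpha_{i2})^2 / 2\sigma^2},\ \ldots,\ e^{-(y_{iN} + \alpha_{iN})^2 / 2\sigma^2}\,]^T$    (12)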


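• LR-based fusion rule using only target fluctuation statistics. Under the assumption of the
Gaussian clutter model and the Rayleigh target fluctuation model, the LR-based fusion statistic
using only the target fluctuation statistics is summarized below

    $\Lambda_j^{(2)} = \prod_{i=1}^{M} \frac{P_{di}\,\psi_{ij}^{(+)} + (1 - P_{di})\,\psi_{ij}^{(-)}}{P_f\,\psi_{ij}^{(+)} + (1 - P_f)\,\psi_{ij}^{(-)}}, \quad j = 1, \ldots, N$    (14)

where $\gamma = \sigma_\alpha / (\sigma \sqrt{\sigma_\alpha^2 + \sigma^2})$, with $2\sigma_\alpha^2$
being the mean square value of the target fluctuation model and $\sigma^2$ the clutter variance,
and $\psi_{ij}^{(\pm)}$ is the $j$th element of

    $\psi_i^{(+)} = [\,1 + \sqrt{2\pi}\,\gamma y_{i1}\, e^{\gamma^2 y_{i1}^2 / 2}\, Q(-\gamma y_{i1}),\ \ldots,\ 1 + \sqrt{2\pi}\,\gamma y_{iN}\, e^{\gamma^2 y_{iN}^2 / 2}\, Q(-\gamma y_{iN})\,]^T, \quad i = 1, \ldots, M$    (15)

    $\psi_i^{(-)} = [\,1 - \sqrt{2\pi}\,\gamma y_{i1}\, e^{\gamma^2 y_{i1}^2 / 2}\, Q(\gamma y_{i1}),\ \ldots,\ 1 - \sqrt{2\pi}\,\gamma y_{iN}\, e^{\gamma^2 y_{iN}^2 / 2}\, Q(\gamma y_{iN})\,]^T, \quad i = 1, \ldots, M$    (16)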

• Fusion statistic using a maximum ratio combiner (MRC). In the low SNR regime, if the local
radar sensors are identical, i.e., $P_{di}$ and $P_{fi}$ are the same for all $i$, then
$\Lambda^{(1)}$ reduces to a form analogous to an MRC

    $\Lambda_j^{(3)} = \frac{1}{M} \sum_{i=1}^{M} \alpha_{ij} y_{ij}, \quad j = 1, \ldots, N$    (17)

• Fusion statistic using an equal gain combiner (EGC). In the low SNR regime, if the local radar
sensors are identical, i.e., $P_{di}$ and $P_{fi}$ are the same for all $i$, then $\Lambda^{(2)}$
reduces to a form analogous to an EGC

    $\Lambda_j^{(4)} = \frac{1}{M} \sum_{i=1}^{M} y_{ij}, \quad j = 1, \ldots, N$    (18)
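
The per-target fusion described in this subsection can be sketched as follows. The sketch assumes the observations are stored as a list of rows, one row per radar and one column per target (a layout chosen here for convenience), a common CFAR false alarm rate, and a distinct detection probability per radar. It computes the statistics-only LR rule of (14) and the EGC of (18); the function names are hypothetical.

```python
import math


def q_func(x: float) -> float:
    """Gaussian tail probability Q(x) = P[N(0, 1) > x]."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def fuse_targets(y, pd, pf, sigma, sigma_alpha):
    """Per-target fusion statistics.

    y[i][j]     : observation on target j received from radar i
    pd[i]       : detection probability of radar i
    pf          : common (CFAR) false alarm rate
    sigma       : clutter standard deviation
    sigma_alpha : Rayleigh parameter of the RCS (mean square is 2 * sigma_alpha^2)

    Returns (lr, egc): two lists with one entry per target.
    """
    m, n = len(y), len(y[0])
    gamma = sigma_alpha / (sigma * math.sqrt(sigma_alpha ** 2 + sigma ** 2))
    lr, egc = [], []
    for j in range(n):
        prod = 1.0
        for i in range(m):
            x = gamma * y[i][j]
            g = math.sqrt(2 * math.pi) * x * math.exp(x * x / 2)
            plus = 1 + g * q_func(-x)
            minus = 1 - g * q_func(x)
            prod *= (pd[i] * plus + (1 - pd[i]) * minus) / (pf * plus + (1 - pf) * minus)
        lr.append(prod)
        egc.append(sum(y[i][j] for i in range(m)) / m)
    return lr, egc


if __name__ == "__main__":
    # Three radars, two targets; the values are made up for illustration.
    obs = [[1.0, -1.0], [0.8, -1.2], [1.1, -0.9]]
    print(fuse_targets(obs, [0.9, 0.8, 0.85], 0.01, 0.5, math.sqrt(0.5)))
```

Because the radars are assumed spatially independent, each target is fused separately and the result is one statistic per target, which mirrors the vector form used in the text.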




4.2  Decision  fusion  for  multi-target  multi-radar  sensor  networks 

When considering multi-hop radar sensor networks, the decision fusion problem for multiple targets
can be quite complicated. For simplicity, we follow the assumptions of the single-hop case and
further assume that the relay radar sensors have no direct observation of any of the targets.
Comparing Fig. 3 with Fig. 4, this assumption ensures that the MIMO fusion problem for the multi-hop
case has the same dimension as the one in the single-hop case.


Figure  4:  MIMO  fusion  model  for  multi-hop  radar  sensor  networks 

We make the further assumption that each relay radar makes a simple hard decision on the
signal transmitted from the radar at the previous hop. Therefore, given that the clutter is
Gaussian, we have

    $s^{k} = \mathrm{sign}(\alpha^{k-1} s^{k-1} + u^{k-1})$    (19)

Hence, the signals ultimately received at the fusion center from all the $M$ last-hop
radars have a form similar to (9). We also assume that the Rayleigh RCS has unit power, i.e.,
$E[\alpha_{ij}^2] = 1$, and that the Gaussian clutter has variance $\sigma^2$, to facilitate the SCR
calculation later in the paper.
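
The hard-decision relay of (19) can be simulated with a short Monte Carlo sketch. It assumes a unit-power Rayleigh coefficient and independent zero-mean Gaussian clutter at every hop, and it maps a sum of exactly zero to +1; these choices and the function names are illustrative and not taken from the report.

```python
import math
import random


def relay_chain(s0, hops, sigma, rng, sigma_alpha=math.sqrt(0.5)):
    """Pass a decision s0 in {+1, -1} through `hops` hard-decision relays, eq. (19)."""
    s = s0
    for _ in range(hops):
        # Rayleigh coefficient with mean square value 2 * sigma_alpha^2 (unit power here).
        alpha = math.hypot(rng.gauss(0.0, sigma_alpha), rng.gauss(0.0, sigma_alpha))
        u = rng.gauss(0.0, sigma)  # Gaussian clutter sample
        s = 1 if alpha * s + u >= 0 else -1
    return s


def flip_rate(hops, sigma, trials, seed=0):
    """Fraction of trials in which a transmitted +1 arrives as -1."""
    rng = random.Random(seed)
    flips = sum(relay_chain(1, hops, sigma, rng) != 1 for _ in range(trials))
    return flips / trials


if __name__ == "__main__":
    for hops in (1, 2, 3):
        print(hops, flip_rate(hops, 0.5, 20000))
```

The flip rate grows with the number of hops, which is consistent with the expectation that multi-hop delivery degrades the decisions reaching the fusion center.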

Applying the decision rules for single-target, multi-hop WSNs in [11] to our multi-target,
multi-hop radar sensor networks, we get the decision fusion rules as follows.

• Optimal LR-based Fusion Rule

In a multi-hop radar sensor network, we assume that only the first-hop radar sensors are CFAR, with
false alarm probability $P_f^{(1)}$ and probability of detection $P_{di}^{(1)}$. Let $P_{di}$ and
$P_f$ be the probability of detection and the false alarm probability at the $i$th radar in the last
relay. It was proved in [11] that, for the detection of one given target,
$P_{di} \approx P_{di}^{(1)}$ and $P_f \approx P_f^{(1)}$ at high signal to clutter ratio (SCR). At
low SCR, $P_{di}$ and $P_f$ can be approximated as


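    $[\,e^{-(2 y_{i1} \alpha_{i1}) M_i / \sigma^2},\ e^{-(2 y_{i2} \alpha_{i2}) M_i / \sigma^2},\ \ldots,\ e^{-(2 y_{iN} \alpha_{iN}) M_i / \sigma^2}\,]^T, \quad i = 1, \ldots, M$    (23)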
Denote $\Lambda^{(2)}$ as the LR rule that corresponds to the case when only the target RCS
statistics are known. $\Lambda^{(2)}$ can be derived for the multi-hop MT-MR sensor networks as

    $\Lambda_j^{(2)} = \prod_{i=1}^{M} \frac{1 + [P_{di} - Q(\gamma y_{ij})]\,\sqrt{2\pi}\,\gamma y_{ij}\, e^{(\gamma y_{ij})^2 / 2}}{1 + [P_f - Q(\gamma y_{ij})]\,\sqrt{2\pi}\,\gamma y_{ij}\, e^{(\gamma y_{ij})^2 / 2}}, \quad j = 1, \ldots, N$    (24)

where $y_i = [y_{i1}, y_{i2}, \ldots, y_{iN}]$ is a vector containing all $N$ decision data from
radar $i$. $P_{di}$ and $P_f$ are given by (20) and (21).


• The decision fusion rules for the maximum ratio combiner (MRC) and the equal gain combiner (EGC)
have the same form as in the single-hop case because, for both of them, the decision fusion
depends only on the last hop.


5  Simulation  Results 

In this section, we simulate the performance of the decision fusion rules derived for multi-target
radar sensor networks. For ease of SCR calculation, we assume that all the target RCSs have unit
power, i.e., $E[\alpha_{ij}^2] = 1$. Binary decisions are made at the local radar sensors and the
relay radars. The target RCSs are generated using the Rayleigh model.

For both the multi-target, single-hop radar sensor network and the multi-target, multi-hop radar
sensor network, we compare four decision fusion rules:

•  Optimal  LR-based  rule 

•  LR-based  rule  with  target  RCS  statistics  only 

•  MRC  rule 

•  EGC  rule 

In all simulations, we assume the constant false alarm rate $P_f = 0.01$ (for the multi-hop case,
$P_f = 0.01$ is the value at the first hop). Under hypothesis $H_0$, when the target is absent,
$P_f = Q(X_t / \sigma)$. The detection threshold is therefore $X_t = Q^{-1}(P_f)\,\sigma$. When a
target is present, i.e., under hypothesis $H_1$, the probability of detection $P_{di}$ is determined
by the same threshold $X_t$.
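
The threshold computation above can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library for the Gaussian tail and its inverse. The `detection_probability` helper is an assumption added for illustration: it treats the RCS coefficient as known and the clutter as additive Gaussian, which the report does not spell out at this point.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def q_func(x: float) -> float:
    """Gaussian tail probability Q(x)."""
    return 1.0 - _STD_NORMAL.cdf(x)


def q_inv(p: float) -> float:
    """Inverse of Q: the x for which Q(x) = p."""
    return _STD_NORMAL.inv_cdf(1.0 - p)


def cfar_threshold(pf: float, sigma: float) -> float:
    """Detection threshold X_t = Q^{-1}(P_f) * sigma for clutter std sigma."""
    return q_inv(pf) * sigma


def detection_probability(alpha: float, threshold: float, sigma: float) -> float:
    """Assumed model: P[alpha + u > X_t] with u ~ N(0, sigma^2) and alpha known."""
    return q_func((threshold - alpha) / sigma)


if __name__ == "__main__":
    xt = cfar_threshold(0.01, 1.0)
    print(f"X_t = {xt:.4f} sigma")
    print(f"P_d for alpha = 2.0: {detection_probability(2.0, xt, 1.0):.4f}")
```

For P_f = 0.01 the threshold is roughly 2.33 clutter standard deviations, and under the assumed model the detection probability rises with the RCS coefficient while the false alarm rate stays fixed.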


Figure  5:  Single  hop 

Fig. 5 gives the probability of miss detection vs. the SCR for the multi-target, single-hop radar
sensor network. There are two stationary targets and three radar sensors in the field. The
optimal LR-based fusion rule provides the most powerful detection performance, but it requires
complete target RCS knowledge. The LR-based rule with target RCS statistics approaches the
optimal LR-based rule at low SCR and has about 1 dB loss at higher SCR. MRC and EGC have
similar performance; both are slightly worse than the LR-based rule with target RCS statistics.

Fig. 6 and Fig. 7 show the performance for multi-target, multi-hop radar sensor networks. Fig. 6
shows the probability of miss detection when each of the three radar sensors reaches the fusion
center in two hops. Fig. 7 shows the performance when the three radar sensors reach the fusion
center in unequal numbers of hops. In our simulation, we assume that one radar sensor reaches the
fusion center in two hops while the others do so in a single hop. As expected, the detection
performance in the single-hop case is better than in the multi-hop case.

Figure 6: Multi-hop, equal hops

6  Conclusions 

In this paper, we presented the MIMO decision fusion rules for multi-target, multi-hop radar sensor
networks under the assumption that the target RCS follows the Rayleigh model and the clutter echoes
are Gaussian. We derived the optimum LR-based fusion rule and a sub-optimal LR-based fusion rule
that uses the target RCS statistics only. Simulation results show that the MIMO fusion rules approach
the optimal LR rule and outperform MRC and EGC at high signal to clutter ratio (SCR).

In many cases, two or more local radars may share a common relay node on their way to the
fusion center. Under these circumstances, the independence assumption made for the target RCS
may not hold. This is a very interesting spatial correlation issue. As radar observations
always exhibit time correlation, further research will focus on the space-time correlation
of radar sensor networks.

Figure 7: Multi-hop, unequal hops
Abstract

Interference due to the hostile environment and to multi-user access are critical
factors affecting the performance of Wireless Sensor Networks. There is clearly a need
for a system that can survive severe interference. In this paper, we design a
hybrid Frequency Hopping/Time Hopping-Pulse Position Modulated (FH/TH-PPM) UWB
system for Wireless Sensor Networks to confront the hostile environment. FH and TH are
both used to obtain as much diversity gain as possible. An exact analysis is also derived for
precisely calculating the bit error rates for both the Additive White Gaussian Noise channel
and the path-loss channel in the presence of multitone/pulse (tone in frequency domain and
pulse in time domain) interference and Multi-User Interference.

Index Terms: Time Hopping, Frequency Hopping, PPM, Wireless sensor networks,
UWB, BER
1 Introduction


Wireless sensor networks are becoming more popular for an ever-increasing range of applications
with improvements in device size, power control, communications and computing technology.
Since 2002, commercial applications based on Ultra-Wideband (UWB) have grown increasingly
popular. This in turn has ignited interest in the use of this technology for sensor
networks. UWB systems have potentially low complexity and low cost, and they have very
good time-domain resolution, which facilitates location and tracking applications. UWB
wireless sensor networks are therefore promising.

One of the most important applications of WSNs is on the battlefield, where hostile
interference exists. Frequency Hopping (FH) technology improves performance when the
communication system is attacked by hostile interference, and it reduces
the ability of a hostile observer to receive and demodulate the communication signal. This
inherent property makes FH a promising candidate for UWB sensor networks. According to
the UWB definition released by the FCC (FCC, 2002), a signal is UWB if its bandwidth
exceeds 500 MHz. The overall 7.5 GHz bandwidth, that is, the frequency range from 3.1 GHz
to 10.6 GHz under the FCC ruling, can therefore be split into smaller frequency bands of at least
500 MHz each. This property inspired us to design a hybrid FH/TH-PPM UWB system.

Sensor network communication systems have to support multiple access, which means that
different users/sensor nodes share the same physical medium for transmitting and receiving
different data flows. In TH-UWB, the spectrum of the impulse radio signal is usually shaped
by encoding data symbols with TH sequences, which are typically pseudorandom (PN) codes.
These same sequences can also serve as users' signatures and give multiple users access to
the medium. This so-called Time Hopping Multiple Access (THMA) technology is therefore a
reliable choice here. However, in a realistic scenario where systems cannot achieve ideal
synchronization, multi-user interference (MUI) becomes another crucial factor, besides the
hostile interference mentioned above, that affects system performance. The Signal-to-Noise
Ratio (SNR) alone is not enough for a comprehensive performance evaluation of sensor
networks, so the Signal-to-Interference-plus-Noise Ratio (SINR) should be analyzed instead.

Several recent efforts have evaluated the effect of MUI on the symbol error rate with
single-user reception in an AWGN channel [7, 6]. However, none of them considers hostile
jammers, which we address in this paper. For example, when a TH-UWB sensor network is set up
in a hostile environment, it is feasible for an enemy to estimate the shape of a pulse and
then transmit repeated imitation pulses to degrade the performance of the network. The main
contribution of this paper is that we take the hostile interference into consideration along
with the MUI, and obtain a closed-form performance expression through exact analysis.

The rest of this paper is organized as follows. The system models, including the
transmission, channel and receiver, are introduced in Section 2. The MUI and the hostile
interference are studied in Section 3 and Section 4 respectively, where the SINR and the
closed-form BER are also derived. Numerical results and comparisons are presented in
Section 5, and conclusions are drawn in Section 6.

2  System  Models 

In the proposed system, there are $N_F$ non-overlapping FH bands, each with bandwidth $B_h$,
where $B_h$ is the bandwidth required to transmit a TH-PPM signal in the absence of FH. Let
$s^{(k)}(t)$ denote the $k$-th user's signal at time $t$ in this FH/TH-PPM UWB system with a
total of $N_u$ users. It takes the form

$$ s^{(k)}(t) = \sqrt{\frac{E_b}{N_s}} \sum_{j=-\infty}^{+\infty} c_j^{fh(k)}\, p\left(t - jT_f - c_j^{th(k)} T_c - d_j^{(k)}\delta\right) \tag{1} $$

where $p(t)$ is a chip waveform, which can take any of the time-limited pulse shapes proposed
specifically for UWB communication systems, and is normalized to satisfy $\int p^2(t)\,dt = 1$.
The notations and parameters are:

• $N_s$ is the number of pulses used to transmit a single information bit. $T_f$ is the time
duration of a frame. In the general case, $N_s \ge 1$ pulses carry the information of one bit.
The bit duration $T_b$ should satisfy $T_b \ge T_f N_s$.

• $E_b$ is the energy per information bit. $\sqrt{E_b/N_s}$ is the amplitude factor that
normalizes the energy in each symbol.

• $c_j^{th(k)} T_c$ is the time shift introduced by the TH code. $T_c$ is the chip duration.
$c_j^{th(k)}$ is the $j$-th coefficient of the TH sequence used by user $k$; it is
pseudo-random, with each element taking an integer value in the range $[0, N_h - 1]$, where
$N_h$ is the number of hops. $T_c \le T_f/N_h$ should be satisfied.

• The $d_j^{(k)}\delta$ term represents the time shift introduced by PPM modulation. Only 2PPM
is considered in our system, so $d_j^{(k)}$ represents the $j$-th binary data bit (0 or 1)
transmitted by the $k$-th user, and $\delta$ is the PPM shift.

• $c_j^{fh(k)} = \sqrt{2}\cos(2\pi f_j^{(k)} t)$ is the $k$-th user's spreading code during the
$j$-th frame.

Notice that each symbol chooses one of the $N_F$ sub-bands to transmit the signal; within
each sub-band, however, the transmission is TH-2PPM.

In a WSN, sensor nodes stay idle most of the time in order to save energy, so the number of
nodes that are actually communicating is unknown. However, the total number of sensor nodes
in the network, $N_U$, and the access rate $\lambda$ of each node, i.e., the rate at which a
node is in the communication state, are easy to know. Therefore, the number of users of the
communication system, $N_a$, is a Binomial random variable. Since the total number of nodes
$N_U$ is very large, we can approximate the Binomial distribution by a Gaussian random
variable with mean $N_U\lambda$ and variance $N_U\lambda(1-\lambda)$, as

$$ f_{N_a}(n_a) = \frac{1}{\sqrt{2\pi N_U\lambda(1-\lambda)}}\, e^{-(n_a - N_U\lambda)^2 / 2N_U\lambda(1-\lambda)} \tag{2} $$
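
The quality of the Gaussian approximation to the Binomial count of active nodes can be checked numerically. The following is a minimal sketch, not part of the original analysis; the function names are illustrative, and the values 10,000 nodes and access rate 0.01 are borrowed from the numerical section.

```python
import math

def binomial_pmf(n_total, n_active, rate):
    """Exact probability that n_active out of n_total nodes are communicating."""
    # Work in the log domain so that very large binomial coefficients do not overflow.
    log_pmf = (math.lgamma(n_total + 1)
               - math.lgamma(n_active + 1)
               - math.lgamma(n_total - n_active + 1)
               + n_active * math.log(rate)
               + (n_total - n_active) * math.log(1.0 - rate))
    return math.exp(log_pmf)

def gaussian_pdf(n_total, n_active, rate):
    """Gaussian approximation with mean N*rate and variance N*rate*(1-rate)."""
    mean = n_total * rate
    var = n_total * rate * (1.0 - rate)
    return math.exp(-(n_active - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Compare the two at a few points around the mean (100 active nodes).
for n_active in (80, 100, 120):
    exact = binomial_pmf(10_000, n_active, 0.01)
    approx = gaussian_pdf(10_000, n_active, 0.01)
    print(n_active, exact, approx)
```

The approximation is expected to be closest near the mean and to degrade in the tails, where the Binomial distribution is slightly skewed.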

Each of the $N_a$ users randomly chooses one of the sub-bands to transmit its signal
according to $c_j^{fh(k)}$, symbol by symbol. The number of users in a given sub-band is
therefore also a Binomial random variable, with coefficient $1/N_F$. To simplify the problem,
we assume that the users are distributed optimally, so the number of users sharing the same
channel, $N_u$, can be expressed as

$$ N_u = N_a / N_F. \tag{3} $$


Assume that over one $T_f$, the signals of $N_u$ users are transmitted simultaneously over a
channel with $L_c$ paths [2]. The composite waveform at the output of the receiver antenna may
be written as

$$ r(t) = \sum_{k=1}^{N_u} \sum_{l=1}^{L_c} \alpha_l^{(k)} s^{(k)}\left(t - \tau_l^{(k)}\right) + n(t) + I(t) \tag{4} $$

where $n(t)$ is the additive Gaussian noise with two-sided power spectral density $N_0/2$,
$I(t)$ is the hostile jammer interference, and $\alpha_l^{(k)}$ and $\tau_l^{(k)}$ are the
attenuation and the delay affecting the replica of the $k$-th user's signal traveling through
the $l$-th path. In writing (4), we have implicitly assumed a static channel, meaning that
$\alpha_l^{(k)}$ and $\tau_l^{(k)}$ are either fixed or vary so slowly that they are
practically constant over several bits.

Consider $s^{(1)}(t)$ to be the desired user's signal; the signals of all the other
$N_u - 1$ users are interference. Under the following assumptions:

• The reference transmitter and receiver of a reference link are perfectly synchronized under
the coherent detection hypothesis;

• A dominant path exists that conveys the major part of the desired user's energy [3];

the decision statistic for the 1-st user's $j$-th symbol is obtained as

$$ Z_j = \int_{jT_f}^{(j+1)T_f} r(t)\, v\left(t - \tau^{(1)} - jT_f - c_j^{th(1)} T_c\right) dt \tag{5} $$

where $v(t) = p(t) - p(t - \delta)$ is the correlation template waveform.

Afterwards, a simple hard decision on the information bit, based on $Z_j$,
$j = 0, 1, \ldots, N_s - 1$, is made by majority law.

3  Multi-User  Interference  Analysis 

In this section, we first focus on the analysis of MUI in the absence of hostile jammer
interference. We assume there is no inter-channel interference. Therefore, the received signal
of the 1-st user's $j$-th symbol can be expressed as

$$ r_j(t) = r_{j,u}(t) + r_{j,mui}(t) + n(t) \tag{6} $$

where $r_{j,u}(t)$ is the useful signal and $r_{j,mui}(t)$ is the MUI contribution at the
receiver input. If the users are many and have comparable powers, we can approximate the MUI
as a white Gaussian process by the central limit theorem [5] and, as such, it can be lumped
into the additive Gaussian noise,

$$ w_{tot}(t) = r_{j,mui}(t) + n(t) \tag{7} $$

and $w_{tot}(t)$ is still a white Gaussian process. Correspondingly, the minimum error
probability can be achieved by computing (5).

However, we still need to evaluate the energy of the MUI. Since the system is asynchronous,
we need to consider all cases where a pulse originated by any of the transmitters other than
TX1 is detected by the receiver. First of all, we analyze the noise provoked at the output of
the receiver by the presence of one alien pulse, using a method similar to that in [1],

$$ mui^{(k)}\left(\tau^{(k)}\right) = \sqrt{E_{RX}} \int_0^{T_f} p\left(t - \tau^{(k)}\right) v(t)\, dt \tag{8} $$

where $E_{RX} = E_b/N_s$, and here we suppose $\alpha^{(k)} = 1\ \forall k$.

Although $\tau^{(k)}$ is uniformly distributed over $[0, T_f)$, the MUI can affect the
desired user only when $\tau^{(k)}$ falls in $[-T_p, T_c]$, with $T_c = 2T_p$ [1]. Hence,

$$ \sigma_{mui}^2 = \frac{E_{RX}}{T_f} \int_{-T_p}^{T_c} \left( \int_0^{T_f} p\left(t - \tau^{(k)}\right) v(t)\, dt \right)^2 d\tau^{(k)} \tag{9} $$

4 Performance Analysis with Multitone/pulse Interference

In this section, the SIR for the hostile interference part is obtained. As previously
mentioned, in a hostile environment, which is a common case for sensor networks, it is
feasible for an enemy to estimate or detect the pulse waveform and then send an imitation
waveform to interfere with the communication system. However, it is neither economical nor
efficient for the interference to cover the whole bandwidth. What we study here is therefore
multitone/pulse interference (tone in the frequency domain and pulse in the time domain): the
jamming signal consists of one or more tones/sub-bands transmitted within the total
bandwidth, and it has the same pulse shape as the transmitted 2-PPM signal.

We make the following assumptions:

• The multitone/pulse interference has a total power $P_J$, which is transmitted in a total of
$q$ equal-power interfering tones spread randomly over the spread-spectrum bandwidth;

• The time duration of the interference pulse is the same as the time duration of the
transmitted signal pulse $p(t)$, which is denoted as $T_p$. To simplify the problem, we
suppose $T_c = 2T_p$ and $\delta = T_p$. The hop period of the interference is also $T_p$,
and each hop is independent;

• The multitone/pulse interference can catch the signal pulse with perfect timing. We
consider the scenario in which there is at most one interference tone per FH sub-band. Hence,
in one hop, the probability that an FH band contains an interference tone/pulse is $q/N_F$.

As the transmitted signal in (1) shows, the signal hops both in the frequency domain and in
the time domain, symbol by symbol. Therefore, our analysis first focuses on one symbol.


Figure  1:  An  example  of  the  waveform  of  a  2-PPM  signal. 


For one symbol, no matter what $c_j^{fh(k)}$ and $c_j^{th(k)}$ are, the signal is a 2-PPM
signal as shown in Fig. 1. The left waveform corresponds to $d_j^{(k)} = 0$ and the right one
to $d_j^{(k)} = 1$. We partition the symbol duration into two time slots, so the
multitone/pulse interference makes two hops per symbol. Because the hops are independent,
there are four cases in total with regard to the jammer interference for each symbol:


1. Case 1. There is no jammer interference in either of the two slots. The probability of
case 1 is

$$ P\{case1\} = \left(1 - \frac{q}{N_F}\right)\left(1 - \frac{q}{N_F}\right) \tag{18} $$

2. Case 2. There is jammer interference in each slot. The probability of case 2 is

$$ P\{case2\} = \frac{q}{N_F} \cdot \frac{q}{N_F} \tag{19} $$

3. Case 3. There is one and only one jammer interference pulse, and it is in the same slot
as the signal pulse. The probability of case 3 is

$$ P\{case3\} = \frac{q}{N_F}\left(1 - \frac{q}{N_F}\right) \tag{20} $$

4. Case 4. There is one and only one jammer interference pulse, and it is not in the same
slot as the signal pulse. The probability of case 4 is

$$ P\{case4\} = \left(1 - \frac{q}{N_F}\right) \cdot \frac{q}{N_F} \tag{21} $$
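
The four jammer cases can be tabulated numerically to confirm that they cover all outcomes. This is a minimal sketch under the stated assumption that each of the two slots is hit independently with probability q divided by the number of FH bands; the case 3 probability is taken here as the mirror image of case 4, which is an assumption of this sketch, and the function name is illustrative.

```python
def jammer_case_probabilities(q, n_bands):
    """Probabilities of the four jammer cases for one 2-PPM symbol.

    q       -- number of equal-power jammer tones
    n_bands -- number of FH sub-bands
    """
    hit = q / n_bands    # a given slot of the signal's band contains a jammer pulse
    miss = 1.0 - hit     # the slot is free of jammer interference
    return {
        "case1": miss * miss,  # no jammer pulse in either slot
        "case2": hit * hit,    # a jammer pulse in both slots
        "case3": hit * miss,   # one pulse, in the signal slot (assumed symmetric to case 4)
        "case4": miss * hit,   # one pulse, in the empty slot
    }

# Example with the values used in the numerical section (q = 8, 20 sub-bands).
probabilities = jammer_case_probabilities(8, 20)
print(probabilities, sum(probabilities.values()))
```

Because the two slots are treated as independent trials, the four probabilities always sum to one.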

The received signal of the $j$-th symbol of the 1-st user can be expressed as

$$ r_j(t) = r_{j,u}(t) + I_{jammer}(t) + w_{tot}(t) \tag{22} $$

where $r_{j,u}(t)$ and $I_{jammer}(t)$ are the useful signal and the jammer interference
contributions at the receiver input, and $w_{tot}(t)$ accounts for both the thermal and MUI
noise contributions and is still a white Gaussian process, as argued in Section 3. Hence, a
maximum a posteriori (MAP) approach can be adopted here to get the minimum error probability.
For the different cases of the jammer interference, the detection boundaries are shown in
Fig. 2.

Hence, we can obtain $SINR_{jammer}$ straightforwardly.

• For case 1 and case 2,

$$ SINR_{jammer|case1,2} = \frac{E_{RX}}{N_0} \tag{23} $$

• For case 3,

$$ SINR_{jammer|case3} = \frac{\left(\sqrt{E_{RX}} + \sqrt{P_J/q}\right)^2}{N_0} \tag{24} $$

• For case 4,

$$ SINR_{jammer|case4} = \frac{\left(\sqrt{E_{RX}} - \sqrt{P_J/q}\right)^2}{N_0} \tag{25} $$

Figure 2: The MAP detection rule for all the cases.


Removing  the  conditioning  on  cases,  we  get 


Since only $N_u$ is a random variable, we substitute (2) and (3) into (30),

$$ Pr_s = \frac{1}{N_F\sqrt{2\pi N_U\lambda(1-\lambda)}} \int Pr_s\, e^{-(n_u - N_U\lambda)^2 / 2N_U\lambda(1-\lambda)}\, dn_u \tag{31} $$

Once the symbol error rate $Pr_s$ is obtained, the bit error rate $Pr_b$ follows easily by
majority law,

$$ Pr_b = \sum_{k=\lceil N_s/2 \rceil}^{N_s} C_{N_s}^k\, Pr_s^k\, (1 - Pr_s)^{N_s - k} \tag{32} $$

where $\lceil\cdot\rceil$ is the ceiling operation, and $C_{N_s}^k$ is the $N_s$-choose-$k$
Binomial coefficient, i.e., $C_{N_s}^k = \frac{N_s!}{k!(N_s-k)!}$.
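
The majority-law step from symbol error rate to bit error rate is easy to evaluate directly. The following is a minimal sketch, assuming independent symbol errors and a sum that starts at the ceiling of half the number of symbols per bit; the function name is illustrative.

```python
import math

def bit_error_rate(symbol_error_rate, n_s):
    """Bit error rate under majority-law decoding of n_s symbols per bit.

    A bit is decoded wrongly when at least ceil(n_s / 2) of its symbols are wrong.
    Symbol errors are assumed independent with identical probability.
    """
    k_min = math.ceil(n_s / 2)
    return sum(
        math.comb(n_s, k)
        * symbol_error_rate ** k
        * (1.0 - symbol_error_rate) ** (n_s - k)
        for k in range(k_min, n_s + 1)
    )

# Repetition helps as long as the symbol error rate stays well below one half.
for n_s in (1, 3, 5, 7):
    print(n_s, bit_error_rate(0.1, n_s))
```

For odd values of the repetition factor, which are the ones used in the numerical section, no ties can occur. This sketch holds the symbol error rate fixed; in the system itself the energy per symbol drops as the repetition factor grows, so the symbol error rate rises with it.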


5  Numerical  Results  and  Comparisons 

The parameters of the example UWB systems are listed in Table 1.

Table 1: Parameters of the example FH/TH-PPM UWB system

Parameter                       Notation    2nd-order mono-cycle
shaping factor for the pulse    ε           0.25 ns
time shift introduced by PPM    δ           0.5 ns
pulse duration                  T_p         0.5 ns
frame duration                  T_f         8 ns
chip duration                   T_c         1 ns
number of hops                  N_h         6

• The discussion on $N_s$:

We fix $N_F = 20$, $q = 8$, $N_u = 10$ and the signal-to-jammer-interference energy ratio
$E_b/P_J = 5$ dB, and compare the Symbol Error Rate (SER) and Bit Error Rate (BER) for
$N_s = 1, 3, 5, 7$. The results are shown in Figure 3 and Figure 4 respectively. The more
symbols are used to transmit one bit, the less energy each symbol carries, so the SER
increases as $N_s$ increases. However, the BER is the more meaningful metric here, and the
more symbols we use to transmit one information bit, the better the performance we achieve.
The curves drop quickly from SNR = 0 dB to 15 dB but flatten out beyond 15 dB, which is
caused by the jammer interference.


Figure  3:  The  average  SER  for  different  Ns. 

Figure 4: The average BER for different $N_s$.

• The discussion on $E_b/P_J$:

We set $N_F = 20$, $N_s = 3$, $q = 8$, and $N_u = 10$, and compare the SER and BER for
$E_b/P_J = 0, 5, 10$ dB. Figure 5 and Figure 6 show the results. For both SER and BER, a
larger $E_b/P_J$ gives better performance. From $E_b/P_J = 5$ dB to $E_b/P_J = 10$ dB the
performance gain is very limited, because once $E_b/P_J$ exceeds some threshold the jammer
interference is too weak to have any impact on the system.


Figure 5: The average SER for different $E_b/P_J$.

Figure 6: The average BER for different $E_b/P_J$.

• The discussion on $q$:

$N_F$, $N_s$, $N_u$ and $E_b/P_J$ are fixed at 20, 5, 10 and 5 dB respectively. We evaluate
the performance for $q = 2, 8, 18$. At first glance, a larger $q$ might be expected to yield
worse performance, because a larger $q$ means a higher probability that a jammer tone hits
the information signal. However, a larger $q$ also means less jammer energy in each sub-band,
because the total jammer interference power is fixed. Figure 7 and Figure 8 reflect this
trade-off: the worst and best performances are obtained at $q = 2$ and $q = 8$ respectively.

• The discussion on $N_F$:

We evaluate the performance for $N_F = 1, 5, 10, 20$ when $N_s = 1$ and the number of users
in the communication state is 100. Since the aim is to evaluate how partitioning into $N_F$
sub-bands decreases the MUI, we set the channel to be free of jammer interference. Figure 9
shows that performance improves as $N_F$ grows. Considering that partitioning into more
sub-bands means more cost, $N_F$ should be set to an appropriate value such that the Quality
of Service (QoS) is satisfied.

Figure 7: The average SER for different $q$.

• The discussion on $N_u$:

We set $N_F = 20$, $q = 8$, $N_s = 5$ and $E_b/P_J = 5$ dB, and let $N_u$ equal 5, 10, 15
and 20 in turn. The resulting performance is shown in Figure 10 and Figure 11. At high SNR,
the performance degrades quickly as $N_u$ becomes larger. That is because the MUI is related
to $E_b$ under the assumption that each user has comparable power.

• The discussion on $\lambda$:

We showed in Section 2 that the number of active users is approximately a Gaussian random
variable. With known $N_U$, the number of sensor nodes in the WSN, but unknown $N_u$, the
number of users that share the same sub-band, we need to calculate the SER as in (31). We set
$N_U = 10{,}000$ and $N_F = 20$, and we assume the users are optimally distributed. The
performances for the access rates $\lambda = 0.01$ and $\lambda = 0.02$ are shown in
Figure 12 and Figure 13. Although $N_U$ is the same, the difference in $\lambda$ yields quite
different performance.

Figure 8: The average BER for different $q$.

6 Conclusions

In this paper, we present a performance analysis of the hybrid FH/TH-PPM UWB system in the
presence of multitone/pulse jammer interference and multi-user interference. We obtain exact
expressions for the SER and BER in the presence of MUI and hostile jammer interference. We
evaluate, in terms of BER and SER, how the following parameters affect the FH/TH-PPM UWB
system: the number of symbols $N_s$ used to carry one information bit; the
signal-to-jammer-interference ratio $E_b/P_J$; the number of FH sub-bands $N_F$; the number
of tones $q$ of the jammer interference; the number of users $N_u$ sharing the same sub-band;
the total number of users $N_U$ in the wireless sensor network; and the access rate $\lambda$
of each sensor node.

Figure 9: The average BER for different $N_F$.

Abstract

In this paper, we address a fundamental problem in Wireless Sensor Networks: how many hops
does it take for a packet to be relayed over a given distance? For a deterministic topology,
this question reduces to a simple geometry problem. However, a statistical study is needed
for randomly deployed WSNs. We propose a Maximum Likelihood decision based on the joint pdf
of $(H, r)$, which is also derived in this paper. Since the solution is not closed-form, we
also propose an attenuated Gaussian approximation for the joint pdf. We show that the
approximation visibly simplifies the decision process and the error analysis. Latency and
energy consumption estimation are also included as application examples.


I.  Introduction 

The recent advances in MEMS, embedded systems and wireless communications enable the
realization and deployment of wireless sensor networks (WSNs), which consist of a large number
of densely deployed and self-organized sensor nodes. The potential applications of WSNs, such
as environmental monitoring, often emphasize the importance of location information.
Accordingly, geographic routing [1] was proposed to handle such requirements. Most likely, a
packet is not routed to a specific node but to a given location. An interesting question
arises: "how many hops does it take to reach a given location?" The prediction of the number
of hops is important not only in itself but also in helping to estimate the latency and
energy cost, which are both important to the viability of WSNs.

The question becomes very simple if the sensor nodes are placed manually. For example,
suppose the sensor nodes are placed in a square grid with separation $d$. The connectivity
depends on how $d$ compares with the transmission range $R$. If $d < R < \sqrt{2}d$, this is
simply a 4-connectivity network. For any node, the possible distance of its first-hop
neighbors is $\{d\}$, the possible distances of its second-hop neighbors are
$\{\sqrt{2}d, 2d\}$, and so on. Generally, the possible distances of its $n$th-hop neighbors
are $\{\sqrt{(n-i)^2 + i^2}\,d,\ i = 0, 1, 2, \ldots, \lceil n/2 \rceil\}$, where
$\lceil n/2 \rceil$ is the smallest integer not less than $n/2$. Comparing the given distance
with these distances readily gives the required number of hops. For some distances there can
be two solutions, for example $(8-1)^2 + 1^2 = (10-5)^2 + 5^2 = 50$, in which case we have to
select the number of hops with the higher probability. For the geographic approach, such
conflicts can be resolved easily, at some loss of accuracy. Overall, the geographic approach
is more efficient and accurate than a statistical approach on a deterministic topology.
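
The grid case can be enumerated directly. The following is a minimal sketch, assuming the 4-connectivity grid described above and working with squared distances in units of the grid spacing squared so that comparisons stay exact; the function names and the search bound `max_hops` are illustrative.

```python
import math

def hop_distances_squared(n):
    """Squared distances (in units of d^2) of the nth-hop neighbors on the grid."""
    return {(n - i) ** 2 + i ** 2 for i in range(math.ceil(n / 2) + 1)}

def candidate_hop_counts(distance_squared, max_hops=50):
    """All hop counts up to max_hops whose neighbor set contains the given squared distance."""
    return [n for n in range(1, max_hops + 1)
            if distance_squared in hop_distances_squared(n)]

print(sorted(hop_distances_squared(2)))   # second-hop neighbors: 2 and 4, i.e. sqrt(2)d and 2d
print(candidate_hop_counts(50))           # the ambiguous distance from the text
```

For the squared distance 50 the search returns the two hop counts 8 and 10, which is the conflict mentioned above.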


Fig.  1.  The  nodes  in  a  square  grid  placement.  Only  nodes  within  4  hops  are  shown. 

However, if sensor nodes are deployed in a random fashion, which is the case for most
potential applications, the answer is beyond the reach of simple geometry. The stochastic
nature of the random deployment calls for a statistical study. A natural estimate would be to
divide the distance by the average inter-node distance (i.e., the average single-hop
distance). However, such an estimate may be unable to provide the required accuracy. A
probabilistic study is needed here, that is, finding $f(H|d)$, where $H$ is the number of
hops. Although the question raised here has not been directly addressed before, a mirror
problem, finding $f(d|H)$, has been well studied. In [2], Hou and Li studied the 2-D Poisson
distribution to find an optimal transmission range. They found that the hop-distance
distribution is determined not only by node density and transmission range but also by the
routing strategy. They showed results for three routing strategies: Most Forward with Fixed
Radius, Nearest with Forward Progress, and Most Forward with Variable Radius. Cheng and
Robertazzi [3] studied the one-dimensional Poisson point process and found the pdf of $r_i$ as

$$ f_{r_i}(r_i) = \lambda e^{-\lambda(R - r_i)} \tag{1} $$

where $R$ is the transmission range, $\lambda$ is the node density, $r_i$ is the distance
from the source to an $i$th-hop point, and $r_i$ is related to $r_{e_i}$ by

$$ r_{e_i} + r_i = R. \tag{2} $$


The pdf of $r_{e_i}$ is also obtained,

$$ f_{r_{e_i}}(r_{e_i}) = \frac{\lambda e^{-\lambda r_{e_i}}}{1 - e^{-\lambda(R - r_{e_{i-1}})}} \tag{3} $$

The distribution of $r_i$ thus depends on the previous $r_j$, $j < i$. They also pointed out
that the 2-D Poisson point distribution is analogous to the 1-D case, with the length of the
segment replaced by the area of the range.

Vural and Ekici reexamined the study in the context of sensor networks in [4], and gave the
mean and variance of the multi-hop distance. They also proposed to approximate the multi-hop
distance with a Gaussian distribution.

The rest of this paper is organized as follows. We provide some preliminaries on skewness
and kurtosis in Section II. The number-of-hops prediction problem is addressed and solved in
Section III. Since this problem has no closed-form solution, we propose an attenuated
Gaussian approximation and show how it simplifies the error analysis in Section IV. An
application example is shown in Section V. Section VI concludes this paper.

II. Preliminaries: Skewness and Kurtosis


In this section, we provide some preliminaries on statistical methods [5]. Skewness is a
measure of symmetry, or more precisely, the lack of symmetry. A distribution, or sample set,
is symmetric if it looks the same to the left and right of the center point.

Definition 1: [5] For a given sample set $X$, let

$$ m_3 = \sum (X - \bar{X})^3 / n, \tag{4} $$

$$ m_2 = \sum (X - \bar{X})^2 / n, \tag{5} $$

where $\bar{X}$ is the sample mean of $X$, and $n$ is the size of $X$. Then a sample estimate
of the skewness coefficient is given by

$$ g_1 = \frac{m_3}{m_2^{3/2}}. \tag{6} $$

Skewness is zero for a symmetric distribution. Positive skewness indicates right skewness and
negative skewness indicates left skewness.

Kurtosis is a measure of whether the data are peaked or flat relative to a normal
distribution.

Definition 2: [5] A sample estimate of kurtosis for a sample set $X$ is given by

$$ g_2 = m_4 / m_2^2 - 3, \tag{7} $$

where $m_4 = \sum (X - \bar{X})^4 / n$ is the fourth-order moment of $X$ about its mean.

Skewness and kurtosis are useful in determining whether a sample set is normal. Note that the
skewness and kurtosis of a normal distribution are both zero; significant skewness and
kurtosis clearly indicate that the data are not normal.
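
The two sample statistics can be computed in a few lines. The following is a minimal sketch of the moment-based estimates, assuming the usual normalization of the third moment by the 3/2 power of the second moment; the function names are illustrative.

```python
def central_moment(samples, order):
    """Sample moment of the given order about the sample mean (divides by n)."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((x - mean) ** order for x in samples) / n

def skewness(samples):
    """Sample skewness: third central moment over the second to the power 3/2."""
    return central_moment(samples, 3) / central_moment(samples, 2) ** 1.5

def excess_kurtosis(samples):
    """Sample kurtosis relative to the normal distribution (zero for a normal)."""
    return central_moment(samples, 4) / central_moment(samples, 2) ** 2 - 3.0

data = [1.0, 2.0, 3.0, 4.0, 5.0]      # symmetric and flat-topped
print(skewness(data), excess_kurtosis(data))
```

A symmetric sample gives zero skewness, and a flat sample like the one above gives negative kurtosis, in line with the interpretation of the two measures given in the definitions.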


III.  The  Number  of  Hops  Prediction 

A.  Problem  Formulation 

We  make  the  following  assumptions. 

• The nodes are deployed at random on a plane, that is, the node distribution follows a 2-D
Poisson random process. Thus, the probability that there is no node in a given area $A$ is
given by [6]

$$ \Pr(\text{no nodes in } A) = e^{-\lambda A}, \tag{8} $$

where $\lambda$ is the density of nodes.

• The distance $d$ from the source to the destination is known, which is common in geographic
routing.

• Neither the source nor the destination is close to the border. This assumption holds for
most of the nodes if the network size is large enough.

The problem of interest is to find the number of hops, denoted $H$, needed to reach a
specific destination at distance $r$ from a given source node. We can make a Maximum
Likelihood (ML) decision,

$$ \hat{H} = \arg\max_H f(H|r), \quad H = 1, 2, 3, \cdots. \tag{9} $$

Considering

$$ f(H|r) = \frac{f(H, r)}{f(r)}, \tag{10} $$

where $f(r)$ does not depend on $H$, the decision rule can be translated into

$$ \hat{H} = \arg\max_H f(H, r), \tag{11} $$

where $f(H, r)$ is also called the objective function. In the next subsection, we are
concerned with deriving $p(H, r)$.

Fig. 2. Poisson node distribution.

B. Derivation of the Joint PDF p(H, r)

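Let $r$ denote the distance from the source to a node. The cdf of $r$ is

$$ F_r(r) = 1 - e^{-\lambda\pi r^2}, \tag{12} $$

and the pdf of $r$ is

$$ f_r(r) = 2\lambda\pi r e^{-\lambda\pi r^2}. \tag{13} $$

When $H = 1$, the joint cdf of $(H, r_1)$ is

$$ p(H = 1, r_1) = \begin{cases} 1 - e^{-\lambda\pi r_1^2} & r_1 \le R \\ 0 & r_1 > R \end{cases} \tag{14} $$

and the joint pdf is

$$ f(H = 1, r_1) = \begin{cases} 2\lambda\pi r_1 e^{-\lambda\pi r_1^2} & r_1 \le R \\ 0 & r_1 > R \end{cases} \tag{15} $$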
Note that the conditional probability of $H = 1$ given $r \le R$ is unity, which is
intuitively correct but trivial; we are more interested in the multi-hop distance. In the
following, $r > R$ is assumed so that $H > 1$. The two-hop case is shown in Fig. 3. A
second-hop node must satisfy $R < r_2 < 2R$. Furthermore, the farthest first-hop node is not
necessarily at the maximum transmission range, which means there is a gap $r_{e_1}$ between
$R$ and $r_1$, i.e.,

$$ r_{e_1} + r_1 = R. \tag{16} $$

Fig. 3. The second-hop coverage.

Therefore, the joint pdf of $(H_1, r_{e_1})$ is

$$ f(H = 1, r_{e_1}) = \begin{cases} 2\lambda\pi(R - r_{e_1}) e^{-\lambda\pi(R - r_{e_1})^2} & r_{e_1} \le R \\ 0 & \text{o.w.} \end{cases} \tag{17} $$

And accordingly, the joint cdf of $(H_2, r_2)$ is

$$ p(H = 2, r_2 | r_1) = \begin{cases} e^{-\lambda\pi[(R + r_1)^2 - (r_2 + r_1)^2]} - e^{-\lambda\pi[(R + r_1)^2 - r_1^2]} & R < r < 2R \\ 0 & \text{o.w.} \end{cases} \tag{18} $$

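Generally, for $H = n$ (shown in Fig. 4), we have

$$ p(H_n, r_n | r_1, r_2, \cdots, r_{n-1}) = \begin{cases} e^{-\lambda\pi\left[\left(R + \sum_{i=1}^{n-1} r_i\right)^2 - \left(\sum_{i=1}^{n} r_i\right)^2\right]} - e^{-\lambda\pi\left[\left(R + \sum_{i=1}^{n-1} r_i\right)^2 - \left(\sum_{i=1}^{n-1} r_i\right)^2\right]} & R < r < nR \\ 0 & \text{o.w.} \end{cases} \tag{19} $$

and

$$ p(H_n, r_n) = \int_R^{(n-1)R} \int_R^{(n-2)R} \cdots \int_0^R p(H_n, r_n | r_1, r_2, \cdots, r_{n-1})\, f(H_{n-1}, r_{n-1} | r_1, r_2, \cdots, r_{n-2}) \cdots f(H_1, r_1)\, dr_1 \cdots dr_{n-2}\, dr_{n-1} \tag{20} $$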

Theoretically, we can take the derivative of (20) with respect to $r$ to obtain the objective
function, use (11) to decide the most likely $H$ given $r$, and give the probability of error
for such a decision. However, (20) is awkward to evaluate, and the computational cost could
limit the applicability of such a decision scheme.

Fig. 4. The ith-hop coverage.


TABLE I

Statistics of f(H = n, r_n), n > 3

Number of Hops    Mean      Std       Skewness    Kurtosis
1                 19.991    7.0651    -0.57471    -0.58389
2                 45.132    7.8365    -0.16958    -1.0763
3                 72.01     8.2129    -0.10761    -1.0332
4                 99.45     8.391     -0.07938    -0.97857
5                 127.14    8.5323    -0.06445    -0.93104
6                 154.96    8.6147    -0.05341    -0.9004
7                 182.68    8.573     -0.07738    -0.91687

IV.  Attenuated  Gaussian  Approximation 

Since (20) is awkward to evaluate even with numerical methods, we use histograms collected
from Monte Carlo simulations as a substitute for the joint pdf. All the simulation data are collected
from a scenario in which N sensor nodes are uniformly distributed in a circular region with a radius
of 300 meters. For convenience, polar coordinates are used. The source node is placed at
(0, 0). The transmission range is set to R meters. For each setting of (N, R), we ran 300
simulations, in each of which all nodes are re-deployed at random. The node density is
given by

$$\lambda = \frac{N}{\pi \cdot 300^2} \tag{21}$$


Fig. 5. Histograms of hop-distance joint distribution. ($R = 30$, $\lambda = 6.37 \times 10^{-3}$)


The histograms of f(H, r) are plotted in Fig. 5, which clearly shows that the joint distribution
of (H, r) approaches the normal distribution as H increases. Table I lists the first-, second-, third- and fourth-
order statistics of f(H, r). The skewness and kurtosis satisfy the Gaussianity condition
within the tolerance of error. Thus, the objective function can be approximated by

$$f(H = n, r) \approx \alpha^{n} \frac{1}{\sqrt{2\pi}\,\sigma_n} \exp\left(-\frac{(r - m_n)^2}{2\sigma_n^2}\right) \tag{22}$$

where $\alpha$ is the equivalent attenuation base, and $m_n$ and $\sigma_n$ are the mean and standard deviation (std),
respectively. The specific values of these parameters can be evaluated from (20) numerically or
estimated from simulations. From Table I, for large n, the joint pdf of (H, r) has the following
properties:


1) $\sigma_n \approx \sigma_{n-1}$, which means the neighboring joint pdfs have similar spread.

2) $m_n - m_{n-1} \approx m_{n+1} - m_n$, which means the joint pdfs are evenly spaced.

3) $3 < \frac{m_n - m_{n-1}}{\sigma_n} < 5$, which means the overlap between the neighboring joint pdfs is small
but not negligible. (As a rule of thumb, Q(3) is considered relatively small and Q(5) is
regarded as negligible.)

4) $\frac{m_{n+1} - m_{n-1}}{\sigma_n} > 5$, which means the overlap between the non-neighboring joint pdfs is
negligible.

5) $\alpha < 1$. For large density $\lambda$, $\alpha \to 1$. Along with Property 1, this tells us that the neighboring
joint pdfs have nearly identical shape.

As shown in the following discussion, these properties greatly simplify the decision rule and
the error analysis. Another interesting observation, besides these properties, is that the following
equations do not hold.


$$m_n = n\,m_1 \tag{23}$$

$$m_n = nR \tag{24}$$

$$m_n = (n - 1)R + R/2 \tag{25}$$

Although these equations sound plausible, they all give visible errors. The aforementioned
estimator $\lfloor r/R \rfloor + 1$ for H, though widely used, is not good in the new light shed by this
study. However, Property 2 does tell us that the increment of $m_n$ is constant. If it is denoted by $\Delta$,

$$m_n = m_1 + (n - 1)\Delta \tag{26}$$

We showed in [7] that $m_1 = \frac{2}{3}R$, irrespective of the node density. Although $\Delta$ is a function of
$\lambda$ and R, $\lambda$ is often regarded as constant for a specific application and R varies over a short range;
thus, we can safely expect $\Delta = aR$, where a is a constant, for example, a = 0.9 for the data in
Table I. In summary, the following empirical equation holds for most WSN applications.

$$m_n = R\left(\frac{2}{3} + (n - 1)a\right) \tag{27}$$

The above result about the constant increment of the mean hop-distance is used in Section V-B for
energy consumption estimation.
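The properties and the empirical constant can be checked directly against the numbers in Table I. The sketch below is only an illustration: the variable and function names are hypothetical, the means and stds are copied from Table I, and R = 30 is taken from the caption of Fig. 5.

```python
# Mean and std of the hop-distance histograms, copied from Table I (R = 30 m).
R = 30.0
means = [19.991, 45.132, 72.01, 99.45, 127.14, 154.96, 182.68]
stds = [7.0651, 7.8365, 8.2129, 8.391, 8.5323, 8.6147, 8.573]

# Increments m_n - m_{n-1} for n = 2..7 (Property 2: nearly equal for large n).
increments = [b - a for a, b in zip(means, means[1:])]

# Property 3: each increment, in units of sigma_n, should lie between 3 and 5.
separation = [inc / s for inc, s in zip(increments, stds[1:])]

# Empirical ratio a = Delta / R used in (27), averaged over n = 2..7.
a_estimates = [inc / R for inc in increments]
a_mean = sum(a_estimates) / len(a_estimates)


def empirical_mean(n, R=R, a=0.9):
    """Empirical mean hop distance m_n = R(2/3 + (n - 1)a), as in (27)."""
    return R * (2.0 / 3.0 + (n - 1) * a)


if __name__ == "__main__":
    for n, (inc, sep) in enumerate(zip(increments, separation), start=2):
        print(f"n={n}: increment={inc:.3f}, increment/sigma_n={sep:.3f}")
    print(f"average a = {a_mean:.4f}")
    print(f"m_4 from (27) = {empirical_mean(4):.2f}, Table I = {means[3]}")
```

The first increment (n = 2) is noticeably smaller than the later ones, which matches the statement that the properties hold for large n.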

A.  Decision  Boundaries 

Following (11), we decide H given r using the following rule.

$$\hat{H} = \arg\max_{H} f(H, r) \tag{28}$$

Fig. 6. Gaussian Approximation.


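Observing $f(H_n, r_n)$ in Fig. 6, the decision is needed only between neighboring values of H, that is,

$$f(H = n, r) \underset{n+1}{\overset{n}{\gtrless}} f(H = n + 1, r). \tag{29}$$

This is because, for a specific value of r, only two values of H have a dominating f(H = n, r),
compared to which f(H = n, r) for the other values of H is negligible. Substituting (22) into
(29), we obtain the decision boundary $d_n$ between the regions H = n and H = n + 1:

$$d_n = \frac{B + \sqrt{B^2 + AC}}{A}$$

$$A = \sigma_{n+1}^2 - \sigma_n^2$$

$$B = m_n \sigma_{n+1}^2 - m_{n+1} \sigma_n^2$$

$$C = m_{n+1}^2 \sigma_n^2 - m_n^2 \sigma_{n+1}^2 - 2\sigma_n^2 \sigma_{n+1}^2 \ln\frac{\alpha\,\sigma_n}{\sigma_{n+1}} \tag{30}$$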

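Using Property 1,

$$d_n = \frac{m_{n+1}^2 - m_n^2 - 2\sigma_n^2 \ln\alpha}{2(m_{n+1} - m_n)} \tag{31}$$

For large density $\lambda$, Property 5 is applicable and (30) simplifies to

$$d_n = \frac{\sigma_n m_{n+1} + \sigma_{n+1} m_n}{\sigma_n + \sigma_{n+1}} \tag{32}$$

Applying Property 1 to (32),

$$d_n = \frac{m_n + m_{n+1}}{2} \tag{33}$$

If we use the empirical equation (27),

$$d_n = \frac{2}{3}R + \left(n - \frac{1}{2}\right)aR$$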

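No matter which approximate solution we choose for $d_n$, the decision rule is given by

$$r \underset{n}{\overset{n+1}{\gtrless}} d_n.$$

In other words, we decide $\hat{H} = \hat{n}$ if $d_{\hat{n}-1} < r \le d_{\hat{n}}$, which, with $d_n$ taken from the
empirical equation (27), is equivalent to

$$\hat{n} = \left\lceil \frac{r - \frac{2}{3}R}{aR} + \frac{1}{2} \right\rceil$$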
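The closed-form decision rule is easy to sketch in code. The following is a minimal illustration, assuming the empirical boundary $d_n = \frac{2}{3}R + (n - \frac{1}{2})aR$ with a = 0.9 (the value quoted for Table I). The function names and the lower clamp to one hop are assumptions made for this example, not part of the original derivation.

```python
import math


def boundary(n, R, a=0.9):
    """Empirical decision boundary d_n between H = n and H = n + 1."""
    return (2.0 / 3.0) * R + (n - 0.5) * a * R


def estimate_hops(r, R, a=0.9):
    """Return the hop count n with d_{n-1} < r <= d_n (never less than 1)."""
    if r < 0 or R <= 0 or a <= 0:
        raise ValueError("r must be non-negative; R and a must be positive")
    n = math.ceil((r - (2.0 / 3.0) * R) / (a * R) + 0.5)
    # Distances below d_1 all map to a single hop.
    return max(1, n)


def naive_hops(r, R):
    """The widely used estimator floor(r / R) + 1, for comparison."""
    return math.floor(r / R) + 1


if __name__ == "__main__":
    R = 30.0
    for r in (10.0, 30.0, 40.0, 100.0, 180.0):
        print(r, estimate_hops(r, R), naive_hops(r, R))
```

For R = 30 the two estimators agree at r = 100 but differ at r = 30, where the naive rule already reports two hops although 30 m lies below the first boundary $d_1 = 33.5$ m.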


B.  Error  Performance  Analysis 

For our decision rule, a decision error occurs when $H = n \ne \hat{n}$. Thus, the probability of error
with a specific r is

$$p(e, r) = \sum_{n \ne \hat{n}} f(H = n, r) \tag{38}$$

The total probability of error is obtained by integrating (38) over all possible r.

$$P(e) = \int p(e, r)\,dr$$


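According to Property 4, only f(n - 1, r) and f(n + 1, r) can have a significant value over the
decision region $[d_{n-1}, d_n]$.

$$p(e) \approx \sum_{n=2}^{\infty} \int_{d_{n-1}}^{d_n} \left[ f(n-1, r) + f(n+1, r) \right] dr$$

$$= \sum_{n=2}^{\infty} \left\{ \alpha^{n-1}\left[ Q\left(\frac{d_{n-1} - m_{n-1}}{\sigma_{n-1}}\right) - Q\left(\frac{d_n - m_{n-1}}{\sigma_{n-1}}\right) \right] + \alpha^{n+1}\left[ Q\left(\frac{m_{n+1} - d_n}{\sigma_{n+1}}\right) - Q\left(\frac{m_{n+1} - d_{n-1}}{\sigma_{n+1}}\right) \right] \right\} \tag{40}$$

Note that

$$\frac{d_n - m_{n-1}}{\sigma_{n-1}} - \frac{d_{n-1} - m_{n-1}}{\sigma_{n-1}} = \frac{d_n - d_{n-1}}{\sigma_{n-1}} \gg 1
\;\Rightarrow\; Q\left(\frac{d_n - m_{n-1}}{\sigma_{n-1}}\right) \ll Q\left(\frac{d_{n-1} - m_{n-1}}{\sigma_{n-1}}\right) \tag{41}$$

therefore, $Q\left(\frac{d_n - m_{n-1}}{\sigma_{n-1}}\right)$ is negligible compared to $Q\left(\frac{d_{n-1} - m_{n-1}}{\sigma_{n-1}}\right)$. Similarly, $Q\left(\frac{m_{n+1} - d_{n-1}}{\sigma_{n+1}}\right)$ is
negligible. (40) is approximated by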

m  «  csq(^^) + f;|a--q(^±z ) 


^3 


n— 3 


^Vi— 1 


+an+1Q(mn+1  dra)] 

^n+l 

=  o?Q{^^)  +  fV[Q(m"  ~  dn~1) 

CT2  t'z  CT” 

+Q(^— ^)] 

(42) 

Substituting  an  appropriate  solution  of  dn  into  (42)  would  give  us  the  probability  of  error  within 
required  accuracy.  For  example,  if  we  choose  (33), 

,TCln  1 ' 

*"1^1 

n= 3 


2cr„ 


:) 


+Q(--n-+01  mn)] 


V.  Application  Examples 


(43) 


A. Latency Estimation

Fig. 7. Time model.
Suppose it takes $T_{rx}$ for a sensor node to receive 1 bit of a message and $T_{tx}$ to transmit it.
Because the transmission range in sensor networks is short, the propagation time $T_{pr}$ at the
speed of light is negligible. As shown in Fig. 7, given the end-to-end distance r,
we can find the required number of hops H = n according to (35); thus, a good estimator of
the total latency of an l-bit message is

$$l\,[T_{tx} + (n - 1)(T_{tx} + T_{rx}) + T_{rx}] \tag{44}$$

$$= l\,n\,(T_{tx} + T_{rx}) \tag{45}$$
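As a quick check that (44) and (45) agree, the per-bit times can be summed hop by hop and compared with the closed form. This is a small sketch with hypothetical function names and made-up timing values; the per-bit times are assumed to be in seconds.

```python
import math


def latency_stepwise(l_bits, n_hops, t_tx, t_rx):
    """Latency as written in (44): source transmit, n - 1 relays, sink receive."""
    return l_bits * (t_tx + (n_hops - 1) * (t_tx + t_rx) + t_rx)


def latency(l_bits, n_hops, t_tx, t_rx):
    """Closed form (45): l * n * (T_tx + T_rx)."""
    return l_bits * n_hops * (t_tx + t_rx)


if __name__ == "__main__":
    # Illustrative values only: 1000-bit message, 4 hops, 1 us / 2 us per bit.
    print(latency(1000, 4, 1e-6, 2e-6))
    print(math.isclose(latency(1000, 4, 1e-6, 2e-6),
                       latency_stepwise(1000, 4, 1e-6, 2e-6)))
```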


B.  Energy  Consumption  Estimation 


The following model is adopted from [8], where perfect power control is assumed. To transmit
l bits over distance d, the sender's radio expends

$$E_{tx}(l, d) = \begin{cases} l E_{elec} + l \epsilon_{fs} d^2 & d < d_0 \\ l E_{elec} + l \epsilon_{mp} d^4 & d \ge d_0 \end{cases} \tag{46}$$

and the receiver's radio expends

$$E_{rx}(l, d) = l E_{elec}. \tag{47}$$

$E_{elec}$ is the unit energy consumed by the electronics to process one bit of message, $\epsilon_{fs}$ and
$\epsilon_{mp}$ are the amplifier factors for the free-space and multi-path models, respectively, and $d_0$ is the
reference distance that determines which model to use. The values of these communication energy
parameters are set as in Table II.


TABLE II

Energy Consumption Parameters

Name               Value
d_0                86.2 m
E_elec             50 nJ/bit
E_DA               5 nJ/bit
\epsilon_fs        10 pJ/bit/m^2
\epsilon_mp        0.0013 pJ/bit/m^4
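The radio model in (46) and (47) with the Table II values can be written down in a few lines. The sketch below is illustrative only: the constant and function names are hypothetical, and energies are expressed in joules.

```python
# Table II parameters, converted to SI units (joules, meters).
E_ELEC = 50e-9        # electronics energy per bit, 50 nJ/bit
EPS_FS = 10e-12       # free-space amplifier factor, 10 pJ/bit/m^2
EPS_MP = 0.0013e-12   # multi-path amplifier factor, 0.0013 pJ/bit/m^4
D0 = 86.2             # reference distance in meters


def e_tx(l_bits, d):
    """Transmit energy (46): free-space model below d0, multi-path otherwise."""
    if d < D0:
        return l_bits * E_ELEC + l_bits * EPS_FS * d ** 2
    return l_bits * E_ELEC + l_bits * EPS_MP * d ** 4


def e_rx(l_bits):
    """Receive energy (47)."""
    return l_bits * E_ELEC


def e_path(l_bits, hop_lengths):
    """End-to-end energy: every hop costs one transmission and one reception."""
    return sum(e_tx(l_bits, s) + e_rx(l_bits) for s in hop_lengths)


if __name__ == "__main__":
    # One bit over a single 30 m hop, then over three 25 m hops.
    print(e_tx(1, 30.0) + e_rx(1))
    print(e_path(1, [25.0, 25.0, 25.0]))
```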


Let $s_n$ denote the single-hop distance from the (n - 1)th hop to the nth hop. Obviously,
$s_n \le R$. In our experimental setting, R = 30 m < $d_0$, so the free-space model is always
used. This agrees well with most applications, in which multi-hop short-range transmission is
preferred to avoid the steep ($d^2$ to $d^4$) growth in energy consumption for long-range transmission.
Naturally, the end-to-end energy consumption for sending l bits over distance r is given by

$$E_{total}(l, r) = \sum_{i=1}^{\hat{n}} \{E_{tx}(l, s_i) + E_{rx}(l)\}
= l \sum_{i=1}^{\hat{n}} \{E_{elec} + \epsilon_{fs} s_i^2 + E_{elec}\}, \tag{48}$$

where $\hat{n}$ is the decision result for the given r. On average,

$$E[E_{total}(l, r)] = 2\hat{n}\,l E_{elec} + l \epsilon_{fs} \sum_{i=1}^{\hat{n}} E[s_i^2] \tag{49}$$


Fig. 8. The relationship between $r_n$, $r_{n-1}$ and $s_n$.

The relationship between $r_n$, $r_{n-1}$ and $s_n$ is depicted in Fig. 8.

$$t = s_n \cos\theta, \qquad \cos\theta = \frac{s_n^2 + r_n^2 - r_{n-1}^2}{2 s_n r_n}$$

$$\therefore\; s_n^2 = r_{n-1}^2 - r_n^2 + 2 t r_n \tag{50}$$

For large n, $r_n \gg s_n$ and $r_{n-1} \gg s_n$; therefore, $t \to \Delta$. According to Property 2, $\Delta$ can be
treated as a constant.


E[s2n]  « 

(•_•  Property  1)  « 
Substitute  (52)  into  (49), 


£[rti]-£[a  +  2A£[rn] 

^n-l  -  +  2Amn 

2A  mn 


h 

Etotai(J')i’')  ~  ZnlEeiec  +  2e/gA^mn 

i 


(51) 

(52) 


(53) 


VI.  Conclusion 

To predict the number of hops H needed to reach a given distance r in randomly deployed
sensor networks, we proposed an ML decision based on the joint pdf of (H, r), which was also
derived in this paper. Since the solution is not closed-form, we also proposed an attenuated
Gaussian approximation for the joint pdf. We showed that the approximation considerably simplifies the
decision process and the error analysis. Latency and energy consumption estimation are also
included as application examples.

Abstract

In this paper, we are concerned with the optimal cluster size in Underwater Acoustic (UA) networks. Due
to the sparse deployment and the channel properties, the clustering characteristics of UA networks differ from those
of aerial sensor networks. We show that the optimal cluster size also depends on the working frequency
of the acoustic transmission. Furthermore, we show that assigning working frequencies to cluster members
according to their distances to the cluster head could minimize the energy consumption.

I.  Introduction  and  Motivation 

An UnderWater Acoustic Sensor Network (UW-ASN) can be thought of as an ad hoc network
consisting of sensors linked by an acoustic medium to perform distributed sensing tasks. To
achieve this objective, sensors must self-organize into an autonomous network that can adapt
to the characteristics of the underwater environment. UW-ASNs share many communication
technologies with traditional ad hoc networks and terrestrial wireless sensor networks, but there
are some vital differences, such as limited energy and bandwidth constraints [1]. Thus the protocols
developed for traditional wireless ad hoc networks are not necessarily well suited to the unique
features of UW-ASNs. When a wireless sensor may have to operate for a relatively long duration
on a tiny battery, energy efficiency becomes a major concern.

Another issue in shallow water communications is the limited bandwidth: multi-hop
communication could introduce heavy interference between cluster members. Therefore, each
sensor in a cluster communicates directly with its cluster head, and intra-cluster communication
should be coordinated by the cluster head in order to maximize the bandwidth usage.

The  remainder  of  this  paper  is  organized  as  follows. 

II.  Preliminaries 

In  this  section,  we  provide  some  preliminaries  needed  for  further  discussion. 

A.  Underwater  Acoustics  Fundamentals 

Based on the data and formulas in [2], Jurdak, Lopes and Baldi [3] derived the following
model,

$$SL = TL + 85, \tag{1}$$

where SL is the source level and TL is the transmission loss. All the quantities in (1) are in
dB re $\mu$Pa, where the reference value of 1 $\mu$Pa amounts to $0.67 \times 10^{-22}$ Watts/cm$^2$. For
cylindrically spread signals, the transmission loss is approximated by [2],

$$TL = 10 \log d + \alpha d \times 10^{-3}, \tag{2}$$

where d is the distance between source and receiver in meters and $\alpha$ is the frequency-dependent
medium absorption coefficient. Fisher and Simmons [4] measured the medium absorption in
shallow seawater at temperatures of 4°C and 20°C. The average is obtained in [3],

$$\alpha = \begin{cases}
0.0601 \times f^{0.8552} & 1 \le f \le 6 \\
9.7888 \times f^{1.7885} \times 10^{-3} & 7 \le f \le 20 \\
0.3026 \times f - 3.7933 & 20 \le f \le 35 \\
0.504 \times f - 11.2 & 35 \le f \le 50.
\end{cases} \tag{3}$$

To guarantee the reception quality, the required threshold of $\alpha$, denoted by $\bar{\alpha}$, might be chosen
larger than $\alpha$. However, we can generally expect $\bar{\alpha}$ to be a monotonically increasing function of
frequency f, as (3) is. To emphasize this relationship, $\alpha$ is written as $\alpha(f)$ in the rest of this paper.
The transmitter power $P_t$ required to achieve an intensity $I_t$ at a reference distance of 1 m is
expressed as,

$$P_t = 2\pi \times 1\,\text{m} \times H \times I_t, \tag{4}$$

where $I_t$ is related to SL by

$$I_t = 10^{SL/10} \times 0.67 \times 10^{-18}. \tag{5}$$

Summing up (1), (2), (4) and (5), we obtain

$$P_t = C H d\, 10^{\alpha d \times 10^{-4}}, \tag{6}$$

where C is a constant equal to $2\pi(0.67)10^{-9.5}$. Therefore, to transmit l bits over distance d, the
sender's radio expends

$$E_{tx}(l, d) = l E_{elec} + l T_b P_t \tag{7}$$

and the receiver's radio expends

$$E_{rx}(l, d) = l E_{elec}, \tag{8}$$

where $T_b$ is the bit duration and $E_{elec}$ is the unit energy consumed by the electronics to process one
bit of message.
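The chain from (1) to (5) can be evaluated numerically without relying on the closed form. The sketch below is an illustration under stated assumptions: the function names are hypothetical, f is assumed to be in kHz, H is treated as a length in meters supplied by the caller, and frequencies between 6 and 7 (which (3) does not cover) are assigned to the second branch.

```python
import math


def absorption(f):
    """Piecewise absorption coefficient alpha(f) from (3); f assumed in kHz."""
    if not 1 <= f <= 50:
        raise ValueError("(3) is only given for 1 <= f <= 50")
    if f <= 6:
        return 0.0601 * f ** 0.8552
    if f <= 20:  # assumption: the uncovered gap 6 < f < 7 uses this branch
        return 9.7888e-3 * f ** 1.7885
    if f <= 35:
        return 0.3026 * f - 3.7933
    return 0.504 * f - 11.2


def transmit_power(d, f, H):
    """P_t from (1), (2), (4), (5) for distance d in meters."""
    tl = 10 * math.log10(d) + absorption(f) * d * 1e-3   # (2)
    sl = tl + 85                                         # (1)
    i_t = 10 ** (sl / 10) * 0.67e-18                     # (5)
    return 2 * math.pi * 1.0 * H * i_t                   # (4)


def tx_energy(l_bits, d, f, H, t_bit, e_elec):
    """Sender energy (7): electronics plus radiated power over l bit durations."""
    return l_bits * e_elec + l_bits * t_bit * transmit_power(d, f, H)


if __name__ == "__main__":
    for d in (100.0, 500.0, 1000.0):
        print(d, transmit_power(d, f=10.0, H=10.0))
```

Evaluating the steps one at a time makes it easy to see how the exponent in (6) arises: dividing SL by 10 in (5) turns the $\alpha d \times 10^{-3}$ term of (2) into $\alpha d \times 10^{-4}$.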

III.  Optimal  Clustering 

In this section, we make a data-centric analysis of energy consumption in UW-ASN.

A. Problem Formulation

Clustering has been widely used in pattern recognition, and we use it to obtain an energy-
efficient organization for UW-ASN. Consider a heterogeneous UW-ASN, in which the low-
capacity sensors serve as cluster members and are randomly distributed, and the high-capacity
sensors serve as cluster heads and are manually positioned. If we obtain the optimal cluster
size, then the required number of high-capacity sensors and their ideal positions can also be
determined. The alternative is that all the sensors are the same and some of them are elected to be cluster heads. If we
assume the high-capacity sensors have a virtually unlimited energy reserve compared to the low-
capacity ones, only the energy consumption of the low-capacity sensors needs to be counted. The
energy cost for each bit of data collected by the ith member of the kth cluster is

$$E_{CM}(ki) = E_{elec} + T_b P_t(ki) \tag{9}$$

$$P_t(ki) = C H r_{ki}\, 10^{\alpha r_{ki} \times 10^{-4}} \tag{10}$$

where $r_{ki}$ is the distance between the kth cluster head and its ith member.

Considering all c clusters, the overall cost is

$$E_{total} = \sum_{k=1}^{c} \sum_{i=1}^{M_k} E_{CM}(ki), \tag{11}$$

where $M_k$ is the number of non-head members in the kth cluster.

Taking the expected value of the overall energy cost, we obtain $E[E_{total}]$ as the objective function.
Minimizing the overall energy consumption is equivalent to minimizing

$$J = E[r\,10^{\alpha r}]. \tag{12}$$

B.  Solution  for  Random  Deployment 

Suppose the low-capacity sensors are deployed at random; then their locations follow
the two-dimensional Poisson distribution, i.e., the number of nodes $N_A$ in area A is given by

$$\Pr(N_A) = \frac{(\lambda A)^{N_A} e^{-\lambda A}}{N_A!}, \tag{13}$$

where $\lambda$ is the node density. A useful property of the Poisson process is that if the number of
nodes occurring in the area A is N, then the individual outcomes of the N nodes are distributed
independently and uniformly in the area A. For the single-hop cluster, in which all cluster
members can communicate with the cluster head directly, the distance r between any cluster
member and the cluster head has the cdf given by

$$F(r) = \frac{\pi r^2}{\pi R^2}, \tag{14}$$

where R is the cluster size. Thus the pdf of r is

$$f(r) = \frac{2r}{R^2}. \tag{15}$$
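The cdf (14) can be inverted to draw member-to-head distances directly, which gives a simple way to sanity-check (15). The sketch below uses hypothetical function names and a fixed seed. It relies on the fact that the mean of the pdf $2r/R^2$ on [0, R] is $\frac{2}{3}R$, obtained by integrating $r \cdot 2r/R^2$.

```python
import math
import random


def sample_distance(R, rng):
    """Draw r with cdf F(r) = r^2 / R^2 by inverse-transform sampling."""
    return R * math.sqrt(rng.random())


def simulate(R, n_samples=100_000, seed=0):
    """Return the sample mean of r and the fraction of samples with r <= R/2."""
    rng = random.Random(seed)
    samples = [sample_distance(R, rng) for _ in range(n_samples)]
    mean_r = sum(samples) / n_samples
    frac_inner = sum(1 for r in samples if r <= R / 2) / n_samples
    return mean_r, frac_inner


if __name__ == "__main__":
    mean_r, frac_inner = simulate(30.0)
    print(f"mean distance = {mean_r:.3f} (theory {2 * 30.0 / 3:.3f})")
    print(f"P(r <= R/2)   = {frac_inner:.3f} (theory 0.250)")
```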


Suppose the frequency allocation is irrelevant to r, which is the case for most applications in
use; then $\alpha(f)$ and r are independent. Substituting (15) into (12),

$$J = E[r\,10^{r}]\,10^{\alpha}. \tag{16}$$

$$E[r\,10^{r}] = \int_0^R r\,10^{r}\,\frac{2r}{R^2}\,dr
= \frac{2}{R^2}\left[10^{r}\left(\frac{r^2}{\ln 10} - \frac{2r}{\ln^2 10} + \frac{2}{\ln^3 10}\right)\right]_0^R
= \frac{2}{R^2}\left[10^{R}\left(\frac{R^2}{\ln 10} - \frac{2R}{\ln^2 10} + \frac{2}{\ln^3 10}\right) - \frac{2}{\ln^3 10}\right] \tag{17}$$

By setting the derivative of (17) to zero, we obtain

Ropt  =  V2/lnl0.  (18) 

IV.  Conclusion 

Although  clustering  has  been  well  studied  for  the  terrestrial  WSN,  the  unique  characteristics 
of  the  underwater  acoustic  communications  call  for  a  new  study.  Because  the  path  loss  is  not 
only  relevant  to  the  distance,  but  also  related  to  the  working  frequency,  the  optimal  cluster  size 
for  UW-ASN  shows  different  properties  from  the  terrestrial  WSN.  Furthermore,  we  show  that 
assigning  working  frequency  to  cluster  members  according  to  their  distances  to  the  cluster  head 
could  minimize  the  energy  consumption. 
Abstract: Query processing methods have been studied extensively
in traditional database systems. But few of them can
be directly applied to sensor database systems due to the
characteristics of sensor networks: the decentralized nature of sensor
networks, limited computational power, imperfect recorded
information, and the energy scarcity of individual sensor nodes. In
this paper, we propose a quality-guaranteed and energy-efficient
algorithm (QGEE) for sensor database systems. We employ an
in-network query processing method to task sensor networks
through declarative queries. Given a query, QGEE
adaptively forms an optimal query plan in terms of energy
efficiency and query quality. The goal of our approach is to
reduce interference from measurements with extreme
errors and to minimize energy consumption by providing service
that is necessary and sufficient for the requirements
of applications. Moreover, we employ a probabilistic method to
formulate the distribution of imperfect information sources in
terms of a probability distribution function (PDF), and acquire
probabilistic query answers on uncertain data. The probability
attached to an answer allows users to place appropriate confidence in
it. The simulation results demonstrate that our algorithm can
reduce resource usage and supply query answers of satisfactory quality
to users.

I.  Introduction  and  Motivation 

Recent developments in integrated circuit technology have
allowed the construction of low-cost sensor nodes with signal
processing and wireless communication capabilities. Wireless
sensor networks (WSNs) generally consist of a large
number of sensor nodes [2] operating under energy constraints
in unattended mode, which are capable of limited computation,
communication and sensing. WSNs are intended for
a broad range of environmental sensing applications, from
weather data collection to vehicle tracking and habitat
monitoring [1], [3].

The widespread deployment of sensor nodes is transforming
the physical world into a computing platform. Sensor nodes
not only respond to physical signals to produce data, they
also embed computing and communication capabilities. They
are thus able to store, locally process and transfer the data
they produce. From a data storage point of view, a WSN can
be regarded as a kind of database, a distributed sensor database
system (DSDS). A DSDS, compared to traditional database
systems, stores data within the network and allows queries to be
injected anywhere through query processing operators in the
network.


Existing query processing systems for WSNs, including
Directed Diffusion [4], TinyDB [16], and Cougar [5], provide
high-level interfaces that allow users to collect and process
such continuous streams. They are especially attractive
as ways to efficiently implement monitoring applications
without forcing users to write complex, low-level code
for managing multihop network topologies or for acquiring
samples from sensor nodes. TinyDB, Directed Diffusion and
Cougar are relatively mature research prototypes that give
some idea of how future sensor network query processing
systems will function. However, these future systems will be
significantly more sophisticated than any of today's prototypes.
To understand their requirements, queries may be classified
along five dimensions (scope, volume, complexity, timeliness
and quality) that dictate the design of networking mechanisms
for query processing.

The goal of monitoring through sensor nodes is to infer
information about objects from measurements made from
remote locations. Since inference processes are always less
than perfect, there is an element of uncertainty regarding the
answers. When viewed from this perspective, the problem of
uncertainty, which stands for the quality of query answers,
is central to monitoring applications. Thus, to build useful
information systems, it is necessary to learn how to represent
and reason with imperfect information.

In this paper, we propose our solutions for query optimization
and execution following the acquisitional query processing
(ACQP) approach [16]. Compared with typical methods,
ACQP focuses on exploiting the significant new query
processing opportunity that arises in sensor networks: the fact
that smart sensors can control where,
when, and how often data is physically acquired (i.e., sampled)
and delivered to query processing operators. Motivated by
this, we propose a quality-guaranteed and energy-efficient
(QGEE) query processing algorithm for distributed and
heterogeneous WSNs. In the following sections, we outline the QGEE
paradigm, explain its key features, and describe it in some
detail using a particular example of environmental temperature
monitoring. We specify how local rules are obtained, how the desired sensor
nodes are chosen, and how query answers are acquired in terms of
bounded probabilistic values. In doing so, we show how the
QGEE paradigm differs from existing query processing systems
and qualitatively argue that this paradigm offers scaling,
robustness and energy efficiency benefits. We quantify some of
these benefits via detailed packet-level simulations of QGEE.

The  remainder  of  this  paper  is  organized  as  follows:  in 
Section  II,  we  provide  some  preliminaries  on  vector  space 
model  and  k-partial  set  cover  problem;  Section  III  formulates 
the  problems  we  considered  for  our  algorithm;  Section  IV 
presents  our  QGEE  algorithm;  simulation  results  are  given  in 
Section  V;  Section  VI  concludes  this  paper. 


II.  Preliminaries 


A.  Vector  Space  Model 

Vector  Space  Model  (VSM)  [17]  [18]  is  a  way  to  represent 
documents  through  words  that  they  contain.  VSM  has  been 
widely  used  in  the  traditional  information  retrieval  (IR)  field 
[19], [20]. Most search engines also use similarity measures
based  on  this  model  to  rank  Web  documents.  VSM  creates  a 
space  in  which  both  documents  and  queries  are  represented  by 
vectors.  For  a  fixed  collection  of  documents,  an  m-dimensional 
vector  is  generated  for  each  document  and  each  query  from 
sets  of  terms  with  associated  weights.  Then,  a  vector  similarity 
function,  such  as  the  inner  product,  can  be  used  to  compute 
the  similarity  between  a  document  and  a  query. 

In VSM, weights associated with the terms are calculated
based on the following two numbers:

• term frequency, $f_{ij}$, the number of occurrences of term $y_j$
in document $x_i$, and

• inverse document frequency, $g_j = \log(N/d_j)$, where N
is the total number of documents in the collection and $d_j$
is the number of documents containing term $y_j$.

The similarity $sim_{vs}(q, x_i)$ between a query q and a document
$x_i$ can be defined as the inner product of the query vector Q and
the document vector $X_i$:

$$sim_{vs}(q, x_i) = Q \cdot X_i = \sum_{j=1}^{m} v_j \cdot w_{ij} \tag{1}$$

where m is the number of unique terms in the document
collection. The document weight $w_{ij}$ and the query weight $v_j$ are

$$w_{ij} = f_{ij}\,g_j = f_{ij} \log(N/d_j)$$

$$v_j = \begin{cases} \log(N/d_j) & y_j \text{ is a term in } q \\ 0 & \text{otherwise.} \end{cases}$$
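A small example makes the weighting concrete. The sketch below is illustrative only: documents are represented as lists of terms, the function names are hypothetical, and the natural logarithm is assumed because the text does not specify a base.

```python
import math
from collections import Counter


def idf(term, docs):
    """Inverse document frequency g_j = log(N / d_j); 0 if the term is unseen."""
    d_j = sum(1 for doc in docs if term in doc)
    return math.log(len(docs) / d_j) if d_j else 0.0


def similarity(query_terms, doc, docs):
    """Inner product of query weights v_j and document weights w_ij."""
    tf = Counter(doc)  # term frequency f_ij for this document
    score = 0.0
    for term in set(query_terms):
        g = idf(term, docs)
        v_j = g                # query weight: log(N / d_j) for terms in q
        w_ij = tf[term] * g    # document weight: f_ij * log(N / d_j)
        score += v_j * w_ij
    return score


if __name__ == "__main__":
    docs = [["hot", "sensor"], ["cold", "sensor"], ["hot", "hot", "node"]]
    for doc in docs:
        print(doc, round(similarity(["hot"], doc, docs), 4))
```

In this toy collection the third document ranks highest for the query term "hot" because the term occurs twice in it, while the second document scores zero.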


B.  k-Partial  Set  Cover  Problem 

Covering problems are widely studied in discrete optimization.
Basically, these problems involve picking a least-cost
collection of sets to cover elements. Classical problems in
this framework include general set cover problems and partial
covering problems. The k-partial set cover problem [21], a partial
covering problem, asks how to choose a minimum number
of sets to cover at least k of the n elements, and which k elements
should be chosen.

The k-partial set cover problem can be formulated as an integer
program as follows.

MINIMIZE:

$$\sum_{j=1}^{m} c(S_j) \cdot x_j \tag{3}$$

SUBJECT TO:

$$y_i + \sum_{j:\, t_i \in S_j} x_j \ge 1, \quad i = 1, 2, \ldots, n, \tag{4}$$

$$\sum_{i=1}^{n} y_i \le n - k, \tag{5}$$

$$x_j \ge 0, \quad j = 1, 2, \ldots, m, \tag{6}$$

$$y_i \ge 0, \quad i = 1, 2, \ldots, n, \tag{7}$$

where $x_j \in \{0, 1\}$ corresponds to each $S_j \in S$. Iff set $S_j$ belongs
to the cover, then $x_j = 1$. Iff element $t_i$ is not covered, then $y_i = 1$.
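To give a feel for the problem, the sketch below uses a simple greedy heuristic: it repeatedly picks the set with the lowest cost per newly covered element until at least k elements are covered. This is a common textbook heuristic chosen here for illustration; it is not the integer program above and not necessarily the method used later in this paper. All names are hypothetical.

```python
def greedy_partial_cover(sets, costs, k):
    """Pick sets greedily until at least k elements are covered.

    sets  -- list of Python sets of elements
    costs -- list of positive costs c(S_j), one per set
    k     -- number of elements that must be covered
    Returns (indices of chosen sets, set of covered elements).
    """
    covered = set()
    chosen = []
    remaining = set(range(len(sets)))
    while len(covered) < k:
        best, best_ratio = None, None
        for j in sorted(remaining):  # sorted for a deterministic tie-break
            # Only count gain that is still needed to reach k elements.
            gain = min(len(sets[j] - covered), k - len(covered))
            if gain == 0:
                continue
            ratio = costs[j] / gain
            if best is None or ratio < best_ratio:
                best, best_ratio = j, ratio
        if best is None:
            raise ValueError("fewer than k elements can be covered")
        chosen.append(best)
        covered |= sets[best]
        remaining.discard(best)
    return chosen, covered


if __name__ == "__main__":
    sets = [{1, 2, 3}, {3, 4}, {5}, {4, 5, 6}]
    print(greedy_partial_cover(sets, [1, 1, 1, 1], k=5))
```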

III.  Problem  Formulation 

As a motivation for our quality-guaranteed and energy-
efficient query processing, we describe a scenario:

• A great multitude of temperature sensor nodes are randomly
deployed in a region of interest. Each
sensor node (or, in short, node) is connected to other
nodes in its vicinity through a wireless communication
interface, and it uses a multihop routing protocol to
communicate with nodes that are spatially distant. All
nodes are connected to at least one gateway, directly
or through intermediate nodes. Gateways are in charge of
relaying data to a powered PC (front-end node) and, in
the opposite direction, disseminating queries to related
nodes. Within this sensor network, each node is provided
with equal computing and sensing capability, but
measurement quality may differ.

This scenario involves the following region-based query:

• Environmental Temperature Monitoring: With confidence p,
tell the average temperature of nodes in the region
defined by a rectangle (a, b, c, d).

Written in an SQL-like language [15], this query is shown in
Fig. 1.

SELECT  AVG(temp) 

FROM  sensors 

WHERE  loc  in  (a,b,c,d)  AND  PROB>=p 

SAMPLE  PERIOD  100  seconds; 

Fig.  1.  Average  Temperature  Query  in  SQL  Form 

A.  Source  of  Imperfect  Information 

Imperfect information is ubiquitous (almost all information
that we have about the real world is not certain, complete
or precise). Imperfect information can often
be classified into uncertainty, incompleteness, ambiguity and
imprecision, as proposed by Bonissone et al. [12], [14].
Incompleteness arises from the absence of values, imprecision from
the existence of values which cannot be measured with suitable
precision, ambiguity from vague statements, and uncertainty
from the fact that an agent has constructed a subjective opinion
about the truth of a fact which it does not know for certain.

In analyzing and understanding
query answer uncertainty, a significant challenge is to
understand the nature and the source of the uncertainty in
temperature information derived from remote sensing. The image
chain approach [6] is one of the most important and useful
models for remote sensing processing. The image chain identifies
the steps in the remote sensing process (the links in the chain) and
illustrates that these steps are interrelated.


Fig. 2. Image Chain Model

Viewing the working process of our temperature monitoring
application, there are three kinds of imperfect information
sources which contribute greatly to the uncertainty of query
answers. They are measurement quality, the point spread function
(PSF) of nodes, and link quality. Fig. 2 illustrates our image
chain model. Links in the chain represent the steps in
the remote sensing process: nodes collect information
related to the environment (Input), data records flow back to the
related front-end nodes at run-time (Collection), and
query answers are obtained by processing all collected information at the
front-end nodes (Output).

• Measurement Quality of Nodes: The measurement quality of
nodes introduces uncertainty and imprecise information
into query answers. The quality of nodes'
sensing parts usually boils down to their measurement
stability and measurement accuracy. In general, as
measurement stability and accuracy increase, so do
power requirements and cost, which are both troublesome
for general sensor nodes. Therefore, inaccurate
measurements supplied by sensor nodes are a very common
phenomenon.

• Point Spread Function (PSF) of Nodes: The PSF of nodes
introduces ambiguity into query answers. Our temperature
monitoring application is interested in the temperature
over a region instead of at one point in space. But considering
operational feasibility, cost and speed, sampling
is widely used instead of complete measurement. This
raises another imperfect information source:
the PSF of nodes. The PSF is caused by nonuniform sensitivity
within the region associated with a node. Most nodes
exhibit sensitivity variation similar to what is shown in
Fig. 3. Note that nodes are more sensitive toward the
center of their regions than toward the edge.

Fig. 3. (a) 1-Dimension Gaussian model of a PSF and (b) 2-Dimension
Gaussian model of a PSF

• Link Quality: Packet loss due to poor link quality
introduces incompleteness into query answers. The
dynamic and lossy nature of wireless communication
poses major challenges to reliable, self-organizing WSNs.
Packet transmission failures may happen during data
transmission because of collisions, nodes dying out (no
battery), nodes being busy, or node mobility. Moreover,
in the physical layer, sensor mobility generates channel
fading during data transmission, which degrades
performance in terms of bit error rate (BER) and frame error
rate.

Fixing other conditions, such as node density, communication range, sensing range and network coverage, we vary measurement quality, PSF and link quality separately to evaluate the influence of each imperfect information source on the correctness of query answers. The results are given in Table I. Note that as measurement errors, misrepresentation errors, or missing information increase, the errors in query answers clearly increase, and the confidence of query answers is therefore reduced.

TABLE I
Using Root Mean Square of Error (RMSE) to quantify the errors of query answers. LBR stands for link break rate.

       Measurement Quality     PSF                      Link Quality
       σ=0.01     σ=0.1        r=1.65m    r=1.96m       LBR=0.1    LBR=0.5
MAX    0.2537     2.5618       22.882     25.17         0.2069     0.7132
MIN    0.2541     2.5416       22.905     25.195        0.1291     0.5895
AVG    0.0102     0.0944       3.0802     3.5202        0.0936     0.3127

B.  Source  of  Energy  Waste 

Within a network, not all available nodes provide useful information that improves the accuracy of final results. Furthermore, some information might be redundant because nodes close to each other tend to have similar data. From this perspective, collecting raw readings from all nodes at the front-end nodes involves large amounts of data, which leads to shorter lifetimes, especially for energy-limited WSNs.


For example, consider a WSN with $n$ nodes. We assume that the lifetime of every node is $T_{life}$ and that queries submitted to the network are processed sequentially. Suppose there are $\omega$ queries ($A_1, A_2, \ldots, A_\omega$) relating to certain regions in this network ($\omega < n$): $\gamma_1$ nodes participate in $A_1$ relating to area $S_{A_1}$, $\gamma_2$ nodes participate in $A_2$ relating to area $S_{A_2}$, ..., and $\gamma_\omega$ nodes participate in $A_\omega$ relating to area $S_{A_\omega}$. In this case, node density is $\frac{n}{S_{A_1} + S_{A_2} + \cdots + S_{A_\omega}}$ (with $\gamma_1 + \gamma_2 + \cdots + \gamma_\omega = n$) and the lifetime of this network is $L_{t,\omega} = \omega T_{life}$. Note that as the number of queries $\omega$ decreases, node density increases ($\frac{n}{S_{A_1}} > \frac{n}{S_{A_1} + S_{A_2}} > \cdots > \frac{n}{S_{A_1} + \cdots + S_{A_\omega}}$) and the lifetime of the network decreases ($L_{t,1} < L_{t,2} < L_{t,\omega}$).

To conserve energy during information collection, previous networking research has approached data aggregation [9] as an application-specific technique for reducing the amount of data sent over the network. However, where the aggregation should be carried out is an essential and difficult problem, which affects the correctness and the effectiveness of operations.

IV. Quality-Guaranteed and Energy-Efficient (QGEE) Query Processing Algorithm Description and Discussion

Keeping these two problems in mind, the quality requirement and the power limit, we propose a quality-guaranteed and energy-efficient (QGEE) query processing algorithm for WSNs. QGEE employs an in-network query processing method to task networks through declarative queries, which is critical for reducing network traffic when accessing and manipulating sensor data.

In the QGEE algorithm, only a subset of the nodes within a network is chosen to acquire readings or samples corresponding to the fields or attributes referenced in queries. The goal of our approach is to reduce interference from measurements with extreme errors and to minimize energy consumption by providing only the service that is necessary and sufficient for the needs of applications. Moreover, based on the analysis and classification of the sources of imperfect information, we employ a probabilistic method to describe their distribution in terms of a probability distribution function (PDF). Finally, probabilistic query answers are acquired from uncertain data. The probability attached to a query answer allows users to place appropriate confidence in it, as opposed to having an incorrect answer or no answer at all.

A.  Confidence  Control  for  Query  Answers 

1) Query Vector Space Model (VSM) Design and Active Nodes Selection: In information retrieval, VSM is a very efficient method for quantifying the correlation between a query and all candidate documents. If we treat all sensor nodes as candidate documents for a query, the correlation between a query and the nodes can be determined through the same principle used in information retrieval. The following factors are considered in our VSM vector design:

•  Node  Location 


Given a region, the number and the locations of nodes together determine the proportion of the region that is observed. In order to cover as large a region as possible with as few nodes as possible, we should select nodes located at optimal locations. The details of determining optimal locations are presented in Section IV-A.2.

•  Measurement  Quality 

Since the cost and the measurement quality of sensor nodes are related, sensor nodes with various quality levels are usually deployed together in a WSN for economic reasons. Furthermore, through a query, database users specify not only what information they are interested in, but also the required quality of query answers, i.e., the confidence of query answers. In this case, we should select suitable nodes to respond to queries. In Table I, note that the RMSE of query answers differs considerably under different measurement quality (expressed as different variances in that example).

•  Remaining  Battery  Capacity 

The remaining battery capacity of sensor nodes is the third factor we consider. When the battery of a node is used up, the data observed by this node is lost, which increases the uncertainty of query answers to some degree. This motivates us to select nodes with high remaining battery capacity, so that all expected information can be collected.

In our algorithm, we employ VSM to combine all the factors considered and select the most relevant nodes to participate in query processing. The query vector ($T$) is designed as $T = [R_l, A_d, B_m]$.

• $R_l$ stands for location relativity. It indicates the distance between the location of a sensor node at $(x, y)$ and the optimal location at $(x_0, y_0)$:

$$R_l = 1 - \frac{\sqrt{(x - x_0)^2 + (y - y_0)^2}}{L} \tag{8}$$

where $L$ is a factor that ensures $R_l$ lies between zero and one. One way to choose $L$ is to set it equal to the maximum distance between two nodes within the network.

• $A_d$ stands for measurement quality. $A_d$ equals the confidence of the measurement bias. For example, for the speed-detecting sensor node CXM539 [10], the bias is ±1 mGauss with 0.95 confidence. In this case, $A_d = 0.95$.

•  Bm  stands  for  remaining  battery  capacity. 

When a query is submitted, the related top-end node determines the optimal locations for this query and translates the query from SQL form into a query VSM vector $T_0 = [1, A_{d,0}, B_{m,0}]$. For instance, $T_0$ is $[1, p, 5]$ for the query given in Fig. 1, assuming that the maximum battery capacity of nodes is 5 J. $T_0$ and the information on optimal locations are then flooded over the whole network.

Nodes update their query VSM vectors $T_i = [R_{l,i}, A_{d,i}, B_{m,i}]$ ($i = 1, 2, \ldots, n$) according to $T_0$ and the optimal locations. We assume that there are $n$ nodes in this network. $R_{l,i}$ is defined as:

$$R_{l,i} = \max_j r_{i,j} \tag{9}$$

where $r_{i,j} = 1 - \frac{\sqrt{(x_i - x_{0,j})^2 + (y_i - y_{0,j})^2}}{L}$, $(x_i, y_i)$ is the position of node $i$, and $(x_{0,j}, y_{0,j})$ is the position of the $j$th optimal location.
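As a rough illustration of (8) and (9), the sketch below computes the location relativity of one node against several optimal locations. The function name, the sample coordinates and the choice $L = 10$ are hypothetical and only serve the example.

```python
import math

def location_relativity(node, optimal_locations, max_distance):
    """Largest relativity 1 - dist/L of a node over all optimal locations.

    `max_distance` plays the role of L; all names here are illustrative.
    """
    x, y = node
    return max(
        1.0 - math.hypot(x - x0, y - y0) / max_distance
        for (x0, y0) in optimal_locations
    )

# Hypothetical layout: one node at the origin, two optimal locations, L = 10.
relativity = location_relativity((0.0, 0.0), [(3.0, 4.0), (6.0, 8.0)], 10.0)
print(relativity)
```

The nearer optimal location dominates, so a node only needs to be close to one of the disk centers to obtain a high $R_{l,i}$.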

We design a query correlation, which we refer to as $\zeta$, to express the correlation between each node and a query. We formulate $\zeta$ in (10). Observe that the query correlation $\zeta_{T_0,T_i}$ is a function of the query quality requirement and the node's energy, measurement quality and location. Moreover, $\zeta_{T_0,T_i}$ is computed by the nodes in a distributed way.

$$\zeta_{T_0,T_i} = sim_{VS}(T_0, T_i) = \frac{T_0 \cdot T_i}{|T_0|\,|T_i|} = \frac{1 \times R_{l,i} + A_{d,0} \times A_{d,i} + B_{m,0} \times B_{m,i}}{\sqrt{(1 + A_{d,0}^2 + B_{m,0}^2)(R_{l,i}^2 + A_{d,i}^2 + B_{m,i}^2)}} \tag{10}$$


($g(d)$) of nodes in a WSN is defined by (11), and the confidence of query answers is required to be at least $p$, as shown in the example in Fig. 1.

$$g(d) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{d^2}{2\sigma^2}} \tag{11}$$

where $d$ is the distance between a point in the monitored region and a specific node, and $\sigma^2$ is the variance of $d$. $g(d)$ has a form similar to that shown in Fig. 3(a).

Compared with other locations within a disk, the measurements of an active node have the lowest sensitivity/confidence when they represent the situation at the disk's edge. This property gives us the criterion for choosing a suitable value of $r$: if the sensitivity/confidence at the edge of a disk is equal to or higher than $p$, the measurements of the active node represent the situation within the disk with confidence $p$. Setting $g(r) = p$ in equation (11) and solving for $r$ gives equation (12).

$$r = \sigma\sqrt{-\ln(2\pi\sigma^2 p^2)} \tag{12}$$


The higher $\zeta$ is, the higher the similarity between a query and a node. Based on this, we form our criterion for choosing active nodes: the decision on which nodes become active to respond to queries is based on their query correlation. That is, nodes with the highest query correlation among their one-hop neighbors are chosen to participate in the related query processing. In QGEE, active nodes are chosen locally through cooperation among nodes. We assume that each node knows the query correlations of its one-hop neighbors, which can be achieved by requiring each node to broadcast its query correlation initially.
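The selection rule can be sketched as follows, assuming the query correlation in (10) is the usual cosine similarity between the query vector and a node vector. The node vectors, the neighbor lists and the tie-breaking rule (smaller node id wins) are assumptions made for illustration; the text does not specify how ties are resolved.

```python
import math

def query_correlation(t0, ti):
    """Cosine similarity between query vector t0 and node vector ti."""
    dot = sum(a * b for a, b in zip(t0, ti))
    norm = math.sqrt(sum(a * a for a in t0)) * math.sqrt(sum(b * b for b in ti))
    return dot / norm if norm > 0 else 0.0

def select_active_nodes(correlations, neighbors):
    """Keep the nodes whose correlation beats that of every one-hop neighbor.

    Ties are broken in favor of the smaller node id (an assumption).
    """
    active = []
    for node, zeta in correlations.items():
        rivals = neighbors.get(node, [])
        if all((zeta, -node) > (correlations[m], -m) for m in rivals):
            active.append(node)
    return sorted(active)

# Hypothetical query vector [1, p, B_max] and three node vectors [R_l, A_d, B_m].
t0 = [1.0, 0.9, 5.0]
node_vectors = {0: [0.9, 0.95, 4.5], 1: [0.4, 0.95, 1.0], 2: [0.8, 0.9, 5.0]}
neighbors = {0: [1], 1: [0, 2], 2: [1]}  # a three-node chain 0 - 1 - 2

correlations = {i: query_correlation(t0, v) for i, v in node_vectors.items()}
active = select_active_nodes(correlations, neighbors)
print(active)
```

Because each node compares itself only with its one-hop neighbors, nodes 0 and 2 can both become active even though they never exchange information directly.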

2) Optimal Location Determination: We model the problem of determining optimal locations for a query as a k-partial set cover problem, defined as follows. Let $n$ be the number of all sensor nodes and $n'$ be a given positive integer such that $n' \le n$. Given $k$ identical disks with radius $r$, the k-partial set cover problem asks whether the $k$ disks can cover at least $n'$ nodes. In this paper, we only consider sensor nodes in a plane (the dimension is 2). This kind of k-partial set cover problem is an NP-hard problem.

At present, all known exact algorithms for NP-hard problems require time that is exponential in the problem size in the worst case, and it is unknown whether faster algorithms exist. Therefore, for any nontrivial problem size, one practical approach is an approximation algorithm, which obtains a solution in polynomial time. The SETCOVER algorithm [21] is a good approximation method for determining the value of $k$ and the locations of these $k$ disks on a plane. In QGEE, we choose the centers of those $k$ disks as our optimal locations. If we require that these $k$ disks cover all sensor nodes (i.e., $n' = n$), the nodes located at the centers of these disks can monitor almost the entire region of interest to users.
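To make the idea of a polynomial-time approximation concrete, here is a minimal greedy sketch. It is not the SETCOVER algorithm of [21], whose details are not given in this section; it is the textbook greedy heuristic, with the added assumption that disk centers are restricted to node positions.

```python
import math

def greedy_disk_cover(nodes, radius, required):
    """Pick disk centers greedily until at least `required` nodes are covered.

    Candidate centers are the node positions themselves (an assumption).
    """
    uncovered = set(range(len(nodes)))
    centers = []
    while len(nodes) - len(uncovered) < required and uncovered:
        def gain(c):
            # Number of still-uncovered nodes inside the disk centered at node c.
            return sum(1 for i in uncovered
                       if math.dist(nodes[c], nodes[i]) <= radius)
        best = max(range(len(nodes)), key=gain)
        centers.append(nodes[best])
        uncovered -= {i for i in uncovered
                      if math.dist(nodes[best], nodes[i]) <= radius}
    return centers

# Hypothetical deployment: two clusters of nodes, disks of radius 1.5.
nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
centers = greedy_disk_cover(nodes, radius=1.5, required=len(nodes))
print(centers)
```

Setting `required` below the number of nodes gives the partial-cover variant, in which the loop stops as soon as enough nodes are covered.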

In QGEE, considering the influence of the PSF of nodes on the uncertainty of query answers, we determine the radius $r$ of the disks adaptively according to users' quality requirements instead of fixing it. We illustrate the calculation of $r$ with a simple example. We assume that PSF


Note that $r$ is a function of the standard deviation $\sigma$ of the PSF and the query quality requirement $p$. If $\sigma$ is fixed, $r$ decreases as $p$ increases. That is, with a higher query quality requirement, smaller disks are used to search for the optimal locations and more active nodes are needed for a query.
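The dependence of $r$ on $p$ can be checked numerically. The sketch below assumes the Gaussian PSF of (11) and obtains $r$ by solving $g(r) = p$, which is only possible when $p$ does not exceed the peak value $g(0)$; the value $\sigma = 0.3$ is an arbitrary choice for illustration.

```python
import math

def psf(d, sigma):
    """Gaussian point spread function g(d) with standard deviation sigma."""
    return math.exp(-d * d / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def disk_radius(sigma, p):
    """Radius r at which g(r) = p, i.e. r = sigma * sqrt(-ln(2*pi*sigma^2*p^2))."""
    arg = 2.0 * math.pi * sigma * sigma * p * p
    if arg > 1.0:
        raise ValueError("p exceeds the peak of the PSF; no disk radius exists")
    return sigma * math.sqrt(-math.log(arg))

# Hypothetical PSF width; a stricter quality requirement gives a smaller disk.
sigma = 0.3
r_loose = disk_radius(sigma, 0.80)
r_strict = disk_radius(sigma, 0.95)
print(r_loose, r_strict)
```

Evaluating `psf(r_loose, sigma)` returns the requested 0.80 up to rounding, which confirms that the closed form inverts (11).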

3) Sample Size Determination and Semi-Manufactured Query Answer Acquisition: We have picked a set of nodes to respond to a query. The question we answer in this section is: how many measurements should be included in one sample? Sample size determination (a sample being any subset of a population) refers to the process of determining exactly how many measurements should be taken so that the sampling distribution of estimators meets users' pre-specified target precision [23].

Since nodes' readings are subject to many small random errors caused by hardware limitations and environmental noise, there is inherent uncertainty about the true values. Hence a node's reading ($x$) can be expressed as:

$$x = v + e_m + \eta \tag{13}$$


where $v$ is the true value, $e_m$ is the measurement error introduced by the limitations of the device's hardware, and $\eta$ is the environmental noise, which is modeled as white Gaussian noise in this paper, with $\eta \sim N(0, \sigma_\eta^2)$. Based on the central limit theorem [13], the measurement errors follow a normal distribution, that is, $e_m \sim N(0, \sigma_e^2)$. Manufacturers generally supply information on measurement errors in the product's technical datasheet. For example, as mentioned above, the bias for the CXM539 is ±1 Gauss with 0.95 confidence, which means that for CXM539 sensor nodes, $\sigma_e^2 = 0.1302$. In general, if we know the maximum bias $\Delta x$ and its confidence $p$, we can obtain the general expression for $\sigma_e^2$. That is


2  Ax2 

'e  =  WR w 


(14) 


where $Q(x)$ stands for the Q-function, defined as $Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{y^2}{2}} \, dy$.

Moreover, $e_m$ and $\eta$ are independent. Therefore, a node's reading also follows a normal distribution, with mean $\mu_x$ and standard deviation $\sigma_x$ given in (15).

$$\mu_x = v \quad \text{and} \quad \sigma_x = \sqrt{\sigma_e^2 + \sigma_\eta^2} \tag{15}$$

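Therefore the PDF of a node's reading, $f_x(x)$, is

$$f_x(x) = \frac{1}{\sqrt{2\pi(\sigma_e^2 + \sigma_\eta^2)}} e^{-\frac{(x - v)^2}{2(\sigma_e^2 + \sigma_\eta^2)}} \tag{16}$$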


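Based on a node's readings, the estimator of the true value is defined as $\bar{x}_n = \frac{1}{n}\sum_{j=1}^{n} x_j$. Thus the probability distribution function of $\bar{x}_n$, $f_{\bar{x}_n}(\bar{x}_n)$, has the same form as $f_x(x)$ with $\mu_n = \mu_x$ and $\sigma_n^2 = \sigma_x^2 / n$. That is

$$f_{\bar{x}_n}(\bar{x}_n) = \frac{1}{\sqrt{2\pi\sigma_n^2}} e^{-\frac{(\bar{x}_n - \mu_n)^2}{2\sigma_n^2}} \tag{17}$$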


In QGEE, we let $\Delta x$ be the margin of error between the estimator ($\bar{x}_n$) and the true value ($v$), which reflects the target precision of queries, and we require the probability that the error stays within this margin to be not smaller than $p$. The criterion for sample size determination is simply stated as:

$$\Pr\{|\bar{x}_n - v| \le \Delta x\} \ge p \tag{18}$$


Since we know the PDF of $\bar{x}_n$, the probability that the estimation error is not larger than $\Delta x$ is

$$\Pr\{|\bar{x}_n - v| \le \Delta x\} = 1 - 2Q\left(\frac{\Delta x}{\sqrt{(\sigma_e^2 + \sigma_\eta^2)/n}}\right) \tag{19}$$


Solving (18) and (19) for $n$, we obtain

$$n \ge \frac{(\sigma_e^2 + \sigma_\eta^2)\left[Q^{-1}\left(\frac{1-p}{2}\right)\right]^2}{\Delta x^2} \tag{20}$$
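A small numerical sketch of the sample size bound may help. It assumes the bound has the form $n \ge (\sigma_e^2 + \sigma_\eta^2)[Q^{-1}((1-p)/2)]^2 / \Delta x^2$, which follows from requiring $1 - 2Q(\cdot) \ge p$, and it evaluates $Q^{-1}$ through the inverse normal CDF from Python's standard library. The variances and margin used below are made-up values.

```python
import math
from statistics import NormalDist

def q_inverse(prob):
    """Inverse of the Q-function, using Q(x) = 1 - Phi(x)."""
    return NormalDist().inv_cdf(1.0 - prob)

def sample_size(var_e, var_eta, delta_x, p):
    """Smallest integer n satisfying the sample size bound."""
    z = q_inverse((1.0 - p) / 2.0)
    return math.ceil((var_e + var_eta) * z * z / (delta_x * delta_x))

# Hypothetical node: total error variance 1.0, margin of error 0.5.
n_95 = sample_size(var_e=0.75, var_eta=0.25, delta_x=0.5, p=0.95)
n_99 = sample_size(var_e=0.75, var_eta=0.25, delta_x=0.5, p=0.99)
print(n_95, n_99)
```

The required sample grows with the confidence level and with the error variance, and shrinks quadratically as the tolerated margin $\Delta x$ widens.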


Since a statistic measured on samples can rarely, if ever, be expected to be exactly equal to a parameter, it is important that an estimate is accompanied by a statement describing its precision. Confidence intervals [7] provide a way of stating both how close the value of a statistic is likely to be to the value of a parameter and the chance of its being that close. A confidence interval of an attribute, denoted by $U_i$, is an interval $[l_i, h_i]$ such that $l_i$ and $h_i$ are real-valued and the condition $h_i \ge l_i$ holds.

Note that (18) is the same statement made when defining a $100 \times p\%$ confidence interval, and $\Delta x$ is half of the width of the confidence interval. Using the sample size defined by (20) to estimate the true value ($v$), we have

$$\Pr\{\bar{x}_n - \Delta x \le v \le \bar{x}_n + \Delta x\} \ge p \tag{21}$$


With (21), we obtain a bounded value, i.e., $v \in [\bar{x}_n - \Delta x, \bar{x}_n + \Delta x]$, which holds with confidence $p$. We call this kind of query answer from active nodes a "semi-manufactured" query answer.
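The coverage claim can be checked empirically with a short Monte Carlo sketch. All numbers below (true value, noise level, sample size, margin, number of trials, seed) are hypothetical; with $\sigma = 1$, $n = 16$ and $\Delta x = 0.5$ the theoretical coverage is $1 - 2Q(2) \approx 0.954$.

```python
import random

def semi_manufactured_answer(readings, delta_x):
    """Interval [mean - delta_x, mean + delta_x] built from a node's readings."""
    mean = sum(readings) / len(readings)
    return mean - delta_x, mean + delta_x

def empirical_coverage(true_value, sigma, n, delta_x, trials, seed=0):
    """Fraction of simulated intervals that contain the true value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        readings = [rng.gauss(true_value, sigma) for _ in range(n)]
        low, high = semi_manufactured_answer(readings, delta_x)
        if low <= true_value <= high:
            hits += 1
    return hits / trials

coverage = empirical_coverage(true_value=25.0, sigma=1.0, n=16,
                              delta_x=0.5, trials=5000)
print(coverage)
```

The observed fraction should land near the theoretical value, with a small deviation due to the finite number of trials.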

Since heterogeneity is inherent in general WSNs, that is, the measurement quality of each node may differ, the sample size for node $i$ is given in (22) and the confidence interval for node $i$ is $v_i \in [\bar{x}_{n,i} - \Delta x_i, \bar{x}_{n,i} + \Delta x_i]$ with confidence $p_i$.

$$n_i \ge \frac{(\sigma_{e,i}^2 + \sigma_\eta^2)\left[Q^{-1}\left(\frac{1-p_i}{2}\right)\right]^2}{\Delta x_i^2} \tag{22}$$

4) Information Collection: After active nodes are chosen, QGEE employs a data-centric routing algorithm, EM-GMR [11], which is a multipath, power-aware and mobility-aware routing scheme. EM-GMR is used to establish the route-tree from active nodes to front-end nodes for returning query answers. EM-GMR uses a reactive networking approach, in which a route is found only when a message is to be delivered from source to destination.

The EM-GMR scheme consists of a route discovery phase, a route reconstruction phase, and a route deletion phase. In the route discovery phase, the source node uses a fuzzy logic system (FLS) [8] to evaluate all eligible nodes (those closer to the destination location) in its communication range based on three parameters of each node: distance to the destination, remaining battery capacity, and degree of mobility. The source node chooses the top $M$ nodes based on the degree of possibility (the output of the FLS). The source node sends a Route Notification (RN) packet to each desired node, and each desired node replies with a REPLY packet if it is available. If the source node does not receive a REPLY from some desired node after a certain period of time, it picks the node with the $(M+1)$st highest degree of selection possibility. In the second hop, the selected node in each path chooses its next-hop node using an FLS.
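The candidate filtering and top-$M$ selection in the route discovery phase can be sketched as below. The fuzzy logic system itself is not reproduced: `simple_score` is a hypothetical stand-in that merely rewards high battery, short distance and low mobility, and the node records are invented for the example.

```python
import math

def simple_score(node, destination):
    """Hypothetical stand-in for the FLS output (NOT the FLS of EM-GMR)."""
    distance = math.dist(node["pos"], destination)
    return node["battery"] / ((1.0 + distance) * (1.0 + node["mobility"]))

def select_next_hops(source, destination, candidates, m, score=simple_score):
    """Return (top-m eligible nodes, remaining eligible nodes as backups).

    A candidate is eligible only if it is closer to the destination than
    the source is.
    """
    own_distance = math.dist(source, destination)
    eligible = [c for c in candidates
                if math.dist(c["pos"], destination) < own_distance]
    ranked = sorted(eligible, key=lambda c: score(c, destination), reverse=True)
    return ranked[:m], ranked[m:]

candidates = [
    {"id": "a", "pos": (3.0, 0.0), "battery": 4.0, "mobility": 0.1},
    {"id": "b", "pos": (4.0, 1.0), "battery": 1.0, "mobility": 0.5},
    {"id": "c", "pos": (-1.0, 0.0), "battery": 5.0, "mobility": 0.0},
    {"id": "d", "pos": (5.0, 0.0), "battery": 3.0, "mobility": 0.0},
]
chosen, backups = select_next_hops((0.0, 0.0), (10.0, 0.0), candidates, m=2)
print([c["id"] for c in chosen], [c["id"] for c in backups])
```

The backup list corresponds to the nodes that would be tried next (the $(M+1)$st and so on) if a desired node fails to send a REPLY.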

Note that EM-GMR considers the distance to the destination, the remaining battery capacity, and the mobility of each sensor node when setting up route paths. Because mobility is taken into account, this scheme can greatly reduce the frame loss rate and the link failure rate, so that the incomplete information caused by poor link quality is reduced to a certain degree.


B. Energy Consumption Control for Query Processing

For energy consumption control, we employ three strategies: control of the number of active nodes, sample size control, and link quality control.

First, in the query VSM design, node location is included alongside measurement quality and remaining battery capacity, since it is directly related to the number of active nodes needed to cover the whole monitoring region. By solving the optimal location problem, we can cover as large a monitoring region as possible with as few nodes as possible, which conserves energy.

Second, in order to ensure that the confidence of estimators satisfies users' requirements, we use (22) to determine the sample size. Too large a sample wastes resources, and too small a sample diminishes the utility of the results. In our algorithm, we let (22) specify the sample size during information sensing. Therefore, we can acquire enough samples to meet users' pre-specified target precision while reducing the energy consumed by data sensing.

Third, we greatly reduce the frame loss rate and the link failure rate by choosing more suitable nodes to set up the route-tree for queries. With this improvement, we can reduce the energy consumed by route-tree maintenance and information retransmission.


C.  Final  Query  Answer  Acquisition 

After queries have been optimized and disseminated, the query processor begins to execute them, processing all semi-manufactured query answers to acquire the final query answers. Aggregation is required in many database applications and is used in statistical queries that summarize information from database tuples. Common functions applied to collections of numeric values include SUM, AVERAGE, MAXIMUM, and MINIMUM. In this paper, our discussion of how to obtain final query answers focuses on the most commonly used aggregation operations: MAXIMUM, MINIMUM, and AVERAGE.

Returned semi-manufactured answers are confidence intervals, i.e., $[\bar{x}_{n,i} - \Delta x_i, \bar{x}_{n,i} + \Delta x_i]$ with confidence $p_i$ ($i = 1, 2, \ldots, \psi$). For simplicity, we let $l_i = \bar{x}_{n,i} - \Delta x_i$ and $h_i = \bar{x}_{n,i} + \Delta x_i$. We assume there are $\psi$ active nodes for a query.


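1) MAXIMUM/MINIMUM Aggregation: We let $Z_{max} = \max_i(\bar{x}_{n,i})$ and $Z_{min} = \min_i(\bar{x}_{n,i})$ ($i = 1, 2, \ldots, \psi$). The cumulative distribution function (CDF) of $\bar{x}_{n,i}$ is given in (23) according to (17).

$$F_{\bar{x}_{n,i}}(\bar{x}_{n,i}) = Q\left(\frac{\mu_{n,i} - \bar{x}_{n,i}}{\sigma_{n,i}}\right) \tag{23}$$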


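Since measurements from individual active nodes are independent of each other, the CDFs of $Z_{max}$ and $Z_{min}$ are given in (24) for MAXIMUM and in (25) for MINIMUM.

$$F_{Z_{max}}(z) = \prod_{i=1}^{\psi} F_{\bar{x}_{n,i}}(z) = \prod_{i=1}^{\psi} Q\left(\frac{\mu_{n,i} - z}{\sigma_{n,i}}\right) \tag{24}$$

and

$$F_{Z_{min}}(z) = 1 - \prod_{i=1}^{\psi}\left(1 - F_{\bar{x}_{n,i}}(z)\right) = 1 - \prod_{i=1}^{\psi}\left(1 - Q\left(\frac{\mu_{n,i} - z}{\sigma_{n,i}}\right)\right) \tag{25}$$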
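The product forms in (24) and (25) are the standard CDFs of the maximum and minimum of independent random variables. The sketch below evaluates them for normally distributed estimators, using the normal CDF from Python's standard library (which equals $Q((\mu - z)/\sigma)$); the example means and standard deviations are hypothetical.

```python
from statistics import NormalDist

def cdf_max(z, estimators):
    """P(max <= z) for independent normal estimators given as (mean, std)."""
    prob = 1.0
    for mu, sigma in estimators:
        prob *= NormalDist(mu, sigma).cdf(z)
    return prob

def cdf_min(z, estimators):
    """P(min <= z) = 1 - product of P(estimator > z)."""
    prob_all_above = 1.0
    for mu, sigma in estimators:
        prob_all_above *= 1.0 - NormalDist(mu, sigma).cdf(z)
    return 1.0 - prob_all_above

# Two hypothetical estimators, both centered at 0 with unit standard deviation.
estimators = [(0.0, 1.0), (0.0, 1.0)]
print(cdf_max(0.0, estimators), cdf_min(0.0, estimators))
```

For two identical estimators evaluated at their common mean, each factor is 0.5, so the maximum lies below the mean with probability 0.25 and the minimum with probability 0.75.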


$$p'_{max} = \frac{1}{\psi}\sum_{i=1}^{\psi} p_i \tag{27}$$

For MINIMUM aggregation, the final query answer is $Z_{min} \in [l'_{min}, h'_{min}]$ with confidence $p'_{min}$ in bounded probability form, where

$$l'_{min} = \min_i\{l_i\} \quad \text{and} \quad h'_{min} = \min_i\{h_i\} \tag{28}$$

$$p'_{min} = \frac{1}{\psi}\sum_{i=1}^{\psi} p_i \tag{29}$$

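2) AVERAGE Aggregation: In this aggregation operation, a derived value over a group of sensor nodes' data is returned: $Z_{avg} = \frac{1}{\psi}\sum_{i=1}^{\psi} \bar{x}_{n,i}$. The PDF of $Z_{avg}$, $f_{Z_{avg}}(z)$, has a form similar to $f_{\bar{x}_n}(\bar{x}_n)$, but its mean and variance are $\frac{1}{\psi}\sum_{i=1}^{\psi}\mu_{n,i}$ and $\frac{1}{\psi^2}\sum_{i=1}^{\psi}\sigma_{n,i}^2$, respectively.

For AVERAGE aggregation, the final query answer is $Z_{avg} \in [l'_{avg}, h'_{avg}]$ with confidence $p'_{avg}$ in bounded probability form, where

$$l'_{avg} = \frac{1}{\psi}\sum_{i=1}^{\psi} l_i \quad \text{and} \quad h'_{avg} = \frac{1}{\psi}\sum_{i=1}^{\psi} h_i \tag{30}$$

$$p'_{avg} = \frac{1}{\psi}\sum_{i=1}^{\psi} p_i \tag{31}$$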
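The bounded answers in (26) through (31) can be computed with a few lines of code. The sketch below assumes each semi-manufactured answer arrives as a triple (lower bound, upper bound, confidence) and that the final confidence is the mean of the individual confidences; the function name and the sample values are hypothetical.

```python
def aggregate(answers, op):
    """Combine (low, high, confidence) triples into one bounded answer."""
    lows = [a[0] for a in answers]
    highs = [a[1] for a in answers]
    confidence = sum(a[2] for a in answers) / len(answers)
    if op == "MAX":
        return max(lows), max(highs), confidence
    if op == "MIN":
        return min(lows), min(highs), confidence
    if op == "AVG":
        return sum(lows) / len(lows), sum(highs) / len(highs), confidence
    raise ValueError("unsupported aggregation: " + op)

# Three hypothetical temperature intervals returned by active nodes.
answers = [(20.0, 21.0, 0.95), (22.0, 23.0, 0.90), (21.0, 22.0, 0.92)]
for op in ("MAX", "MIN", "AVG"):
    print(op, aggregate(answers, op))
```

Only interval endpoints and confidences travel to the front-end node, so the final aggregation costs a single pass over the returned answers.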

V.  Simulation  and  Performance  Evaluation 

Nodes are randomly deployed in an area of $10 \times 10$ m², and the sensing range is 1 m. The initial energy of sensor node $i$ is uniformly distributed within [0, 5] J. We run Monte Carlo simulations to average out the randomness of simulation results. We compare QGEE against a query processing method without any query optimization.

The energy consumption model for data sensing is shown in (32).

$$E_q = E_{se} \times S_t \tag{32}$$


where  Eq  is  the  energy  consumed  by  processing  a  query.  Ese 
is  the  energy  consumed  by  data  sensing.  St  is  the  sample 
period.  In  this  simulation,  we  choose:  Ese  =  bnJ/ sample. 
We use the same energy consumption model as in [22] for the radio hardware. To transmit an $l$-symbol message over a distance $d$, the radio expends:

$$E_{Tx}(l, d) = E_{Tx-elec}(l) + E_{Tx-amp}(l, d) = l \times E_{elec} + l \times \epsilon_{fs} \times d^2 \tag{33}$$

and to receive this message, the radio expends:

$$E_{Rx} = l \times E_{elec} \tag{34}$$


For MAXIMUM aggregation, the final query answer is $Z_{max} \in [l'_{max}, h'_{max}]$ with confidence $p'_{max}$ in bounded probability form, where

$$l'_{max} = \max_i\{l_i\} \quad \text{and} \quad h'_{max} = \max_i\{h_i\} \tag{26}$$


The electronics energy, $E_{elec}$, as described in [22], depends on factors such as coding, modulation, pulse-shaping and matched filtering. The amplifier energy, $\epsilon_{fs} \times d^2$, depends on the distance to the receiver and the acceptable bit error rate. In this paper, we choose $E_{elec} = 50$ nJ/sym and $\epsilon_{fs} = 10$ pJ/sym/m².
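With the parameter values above, the radio model of (33) and (34) reduces to a few lines of arithmetic. The message length and distance in the example are arbitrary.

```python
E_ELEC = 50e-9   # electronics energy, J per symbol (50 nJ/sym)
EPS_FS = 10e-12  # amplifier energy, J per symbol per m^2 (10 pJ/sym/m^2)

def tx_energy(symbols, distance):
    """Energy to transmit `symbols` symbols over `distance` meters."""
    return symbols * E_ELEC + symbols * EPS_FS * distance ** 2

def rx_energy(symbols):
    """Energy to receive `symbols` symbols."""
    return symbols * E_ELEC

# Hypothetical 1000-symbol message sent over 10 m.
print(tx_energy(1000, 10.0), rx_energy(1000))
```

At this short range the electronics term dominates: the amplifier adds only 1 µJ to the 50 µJ spent by the transmit electronics.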


A.  Active  Nodes  Selection  Scheme  Performance 

In Fig. 4, we plot the query index versus the node dead time. We can see that without query optimization, all nodes use up their energy after processing about 20 queries, whereas with QGEE the whole network does not go down until 53 queries are completed. Therefore, QGEE can save about 50% of the energy when processing the same number of queries.


Fig.  4.  Nodes  Dead  Time 

In Fig. 5, we compare the observation coverage rate of the two schemes. Observe that QGEE employs $70 - 45 = 25$ fewer nodes to cover 90% of the area of interest. This simulation result illustrates why QGEE can save energy: about $\frac{25}{70} \times 100 = 35.71\%$ of the nodes switch to energy-saving mode during query processing.


Fig.  5.  Observation  Coverage  Rate 

By employing QGEE, energy is saved and the lifetime of the network is extended. The cost of this improvement is a certain decrease in the observation coverage rate. Fig. 6 shows that the largest decrease in the observation coverage rate is 16.6% for QGEE.

B.  EM-GMR  Performance 

We compared our EM-GMR against the geographical multipath routing (GMR) scheme, in which only the distance to the destination is considered. In Fig. 7, we plot the simulation time versus the number of dead nodes. Observe that when


Fig.  6.  Decrease  of  Observation  Coverage  Rate 


50% of the nodes (30 nodes) die out, the network lifetime for EM-GMR has been extended by about 40%. In Fig. 8, we compare the frame loss rate of the two schemes. Observe that EM-GMR outperforms GMR with about 20% less frame loss. The average end-to-end latency during transmission is 419.68 ms for EM-GMR and 407.5 ms for GMR, and the link failure rate for EM-GMR is 5.68%, but for GMR it is

Fig.  7.  Simulation  time  versus  number  of  nodes  dead 


C.  Probabilistic  Answers  Acquisition  Scheme  Performance 

In this simulation, we apply various query answer confidence requirements (i.e., various values of $p$). To simplify the simulation scenarios, we assume that there are enough nodes satisfying the measurement quality requirements. For the MAXIMUM, MINIMUM and AVERAGE aggregation operations, we check the probability that the true values lie within the confidence intervals acquired at the front-end nodes (see Table II). Note that QGEE successfully obtains confidence intervals that contain the true value of query answers with a probability ($p_2$) equal to or larger than the probability ($p_1$) pre-specified by users.

VI.  Conclusions 

In  this  paper,  we  propose  a  quality-guaranteed  and  energy- 
efficient  algorithm  (QGEE)  for  sensor  database  systems.  Given 


Fig.  8.  Simulation  time  versus  frame  loss  rate 
TABLE II
Confidence for query answers. $p_1$ is the pre-specified value for $p$; $p_2$ is the acquired value for $p$ using QGEE.

p1      p2 (MAXIMUM)   p2 (MINIMUM)   p2 (AVERAGE)
0.8     0.8133         0.8133         0.8112
0.9     0.9139         0.9151         0.9050
--      0.912          0.9122         0.9246
0.93    0.9318         0.9301         0.9334
--      0.948          0.942          0.9428
0.95    0.958          0.9561         0.9551

a query, QGEE can adaptively form an optimal query plan in terms of energy efficiency and query quality. Our approach can reduce interference from measurements with extreme errors and minimize energy consumption by providing only the service that is necessary and sufficient for the needs of applications. Moreover, we employ a probabilistic method to describe the distribution of imperfect information sources in terms of a probability distribution function (PDF), and acquire probabilistic query answers from uncertain data. The probability attached to an answer allows users to place appropriate confidence in it.

The simulation results demonstrate that our algorithm can reduce resource usage by about 50% and the frame loss rate by about 20%, while supplying users with query answers that satisfy their quality requirements.

Abstract

In this paper, we introduce a new method for packet transmission delay analysis and prediction in mobile ad hoc networks. We use a fuzzy logic system (FLS) to coordinate the physical layer and the data link layer. We demonstrate that type-2 fuzzy membership functions (MFs), i.e., Gaussian MFs with uncertain variance, are most appropriate for modeling BER and MAC layer service time. Two FLSs, a singleton type-1 FLS and an interval type-2 FLS, are designed to predict the packet transmission delay based on the BER and the MAC layer service time. Simulation results show that the interval type-2 FLS performs better than the type-1 FLS. We then use the outputs of the FLS predictors to control the transmission power. Simulation results for energy consumption, average delay and throughput again show that the interval type-2 FLS performs better than the type-1 FLS. We also use the actual transmission delay to obtain the performance bound.

Index Terms: wireless ad hoc networks, cross-layer design, fuzzy logic system, interval type-2 fuzzy sets, packet transmission delay analysis

1 Introduction


The demand for Quality of Service (QoS) in mobile ad hoc networks is growing rapidly. To enhance QoS, we consider the physical layer and the data link layer together, in a cross-layer approach. A strictly layered design is not flexible enough to cope with the dynamics of mobile ad hoc networks [1]. Cross-layer design can exploit the interdependencies between layers to optimize overall network performance. The general methodology of cross-layer design is to maintain the layered architecture, capture the important information that influences other layers, exchange this information between layers, and implement adaptive protocols and algorithms at each layer to optimize performance.

Many previous works have focused on cross-layer design for QoS provisioning. Liu [2] combines AMC at the physical layer with ARQ at the data link layer. Ahn [3] uses information from the MAC layer to perform rate control at the network layer in support of real-time and best-effort traffic. Akan [4] proposes a new adaptive transport layer suite, including an adaptive transport protocol and an adaptive rate control protocol, based on lower-layer information.

However, cross-layer design can produce unintended interactions among protocols, such as
adaptation loops. It is hard to characterize the interactions between different layers, and joint optimization
across layers may lead to complex algorithms.

In this paper, we discuss one of the QoS parameters: packet transmission delay. Our
algorithm is quite different from the previous works. We propose to use a Fuzzy Logic System
(FLS) for packet transmission delay analysis and prediction. We apply both a singleton type-1 FLS
and an interval type-2 FLS for the analysis and prediction.

We apply the transmission delay predictors to control the transmission power. The simulation
yields the performance parameters of average delay, energy consumption, and throughput. Assuming
that the actual transmission delay is known, we also obtain these parameters as performance bounds.

The remainder of this paper is structured as follows. In Section 2, we introduce the
preliminaries. In Section 3, we give an overview of fuzzy logic systems. In Section 4, we model
the BER and the MAC layer service time. In Section 5, we apply the FLS to the cross-layer design.
Simulation results and discussions are presented in Section 6. In Section 7, we conclude the paper.


2  Preliminaries 

2.1  IEEE  802.11a  OFDM  PHY 

The  physical  layer  is  the  interface  between  the  wireless  medium  and  the  MAC  [5].  The  principle 
of  OFDM  is  to  divide  a  high-speed  binary  signal  to  be  transmitted  over  a  number  of  low  data-rate 
subcarriers.  A  key  feature  of  the  IEEE  802.11a  PHY  is  to  provide  8  PHY  modes  with  different 
modulation  schemes  and  coding  rates,  making  the  idea  of  link  adaptation  feasible  and  important. 


2.2  IEEE  802.11  MAC 


The  802.11  MAC  uses  Carrier-Sense  Multiple  Access  with  Collision  Avoidance  (CSMA/CA)  to 
achieve  automatic  medium  sharing  between  compatible  stations.  In  CSMA/CA,  a  station  senses 
the  wireless  medium  to  determine  if  it  is  idle  before  it  starts  transmission.  If  the  medium  appears 
to  be  idle,  the  transmission  may  proceed,  else  the  station  will  wait  until  the  end  of  the  in-progress 
transmission.  A  station  will  ensure  that  the  medium  has  been  idle  for  the  specified  inter-frame 
interval  before  attempting  to  transmit. 

Besides carrier sensing and the RTS/CTS mechanism, an acknowledgment (ACK) frame is sent
by the receiver upon successful reception of a data frame. Only after correctly receiving an ACK frame
does the transmitter assume successful delivery of the corresponding data frame. The sequence
for a data transmission is: RTS-CTS-DATA-ACK.

A mobile node retransmits a data packet when it detects a failed transmission. Retransmission
of a single packet can achieve a certain probability of delivery. The probability of delivery $p$
and the number of retransmissions $n$ are related by [6]:

$$n = 1.45 \ln \frac{1}{1 - p} \qquad (1)$$
The IEEE 802.11 standard requires that a data frame be discarded by the transmitter's MAC
after a certain number of unsuccessful transmission attempts. Given the required probability
of delivery, we choose the minimum number of retransmissions.
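As a rough illustration of Eq. (1), the sketch below computes the number of retransmissions for a target delivery probability. The function name is hypothetical, and rounding up to the next integer is an assumption made here to obtain a whole number of attempts; the text does not state how a fractional value is handled.

```python
import math


def retransmissions_for_delivery(p: float) -> int:
    """Smallest whole number of retransmissions n with n >= 1.45 * ln(1 / (1 - p)).

    p is the target probability of delivery, 0 <= p < 1.
    Rounding up is an assumption of this sketch, not taken from the text.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    n = 1.45 * math.log(1.0 / (1.0 - p))
    return math.ceil(n)


if __name__ == "__main__":
    for target in (0.5, 0.9, 0.99):
        print(target, retransmissions_for_delivery(target))
```

The required count grows only logarithmically as the target probability approaches 1, which is why a small retry limit can still give a high delivery probability.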

When the MAC layer acquires access to the channel, the nodes exchange RTS-CTS-DATA-ACK
packets. A packet is considered successfully transmitted once the transmitter receives the ACK packet.
In this paper, we assume that there will always be best-effort traffic present that can be locally and
rapidly rate controlled in an independent manner at each node to yield the necessary low delays and
stable throughputs.

2.3  Bit  Error  Rate 

BER is the number of bits received in error divided by the total number of bits that have been
transmitted, received, or processed over a given time period. It is a measure of transmission quality.
A high BER means a high packet loss rate, and requests for resends increase latency. Delay-sensitive
traffic therefore requires a very low BER.

2.4  MAC  Layer  Service  Time 

There are three basic processes when the MAC layer transmits a packet [7]: the decrement process
of the backoff timer, the successful packet transmission process that takes a time period of $T_{suc}$,
and the packet collision process that takes a time period of $T_{col}$. Here, $T_{suc}$ is the random variable
representing the period that the medium is sensed busy because of a successful transmission, and
$T_{col}$ is the random variable representing the period that the medium is sensed busy by each station
due to collisions. The MAC layer service time is the time interval from the time instant that a
packet becomes the head of the queue and starts to contend for transmission, to the time instant
that either the packet is acknowledged for a successful transmission or the packet is dropped. This
time is important when we examine the performance of higher protocol layers.
2.5 Packet Transmission Delay


The  packet  delay  represents  the  time  it  took  to  send  the  packet  between  the  transmitter  and  the 
next-hop  receiver,  including  the  deferred  time  and  the  time  to  fully  acknowledge  the  packet.  The 
packet  transmission  delay  between  the  mobile  nodes  includes  three  parts:  the  wireless  channel 
transmission  delay,  the  Physical/MAC  layer  transmission  delay,  and  the  queuing  delay  [8]. 

Defining $D$ as the distance between two nodes and $C$ as the speed of light, the wireless channel
transmission delay is:

$$Delay_{ch} = \frac{D}{C} \qquad (2)$$

The Physical/MAC layer transmission delay is determined by the interaction of the transmitter
and the receive channel, the node density, the node traffic intensity, etc.

The queuing delay is determined by the I/O processing rate of the mobile node and the subqueue
length in the node.
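The three delay components described above can be sketched as a simple sum. Only the channel term has a closed form in the text (Eq. 2); the Physical/MAC and queuing terms are passed in as measured or simulated values. The function names and the additive combination are assumptions of this illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # C in Eq. (2)


def channel_delay(distance_m: float) -> float:
    """Wireless channel transmission delay of Eq. (2): D / C, in seconds."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S


def packet_delay(distance_m: float, phy_mac_delay_s: float, queuing_delay_s: float) -> float:
    """Total per-hop delay as the sum of the three parts named in the text.

    phy_mac_delay_s and queuing_delay_s are inputs here (hypothetical
    parameters); the text gives no formula for them.
    """
    return channel_delay(distance_m) + phy_mac_delay_s + queuing_delay_s


if __name__ == "__main__":
    # Two nodes 300 m apart: the propagation term is about a microsecond.
    print(channel_delay(300.0))
    print(packet_delay(300.0, phy_mac_delay_s=0.010, queuing_delay_s=0.020))
```

At the distances typical of an ad hoc network the channel term is negligible next to the other two, which is consistent with the text predicting delay from BER and MAC layer service time.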

To keep the system "stable", the rate at which each node transfers packets intended for
its destination must ensure that, at every node, the queue lengths remain finite and the average
delays are bounded.

2.6  Energy 

A  mobile  node  consumes  significant  energy  when  it  transmits  or  receives  a  packet.  But  we  will  not 
consider  the  energy  consumed  when  the  mobile  node  is  idle. 

The distance between two nodes varies in mobile ad hoc networks, so a distance-dependent power loss
model is used. To send a packet, the sender consumes [9]

$$P_{tx} = P_{elec} + \epsilon_{fs} \cdot d^2 \qquad (3)$$

and to receive the packet, the receiver consumes

$$P_{rx} = P_{elec} \qquad (4)$$

where $P_{elec}$ represents the power necessary for digital processing and modulation, and $\epsilon_{fs}$
represents the power dissipated in the amplifier for free-space transmission over distance $d$.
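Eqs. (3) and (4) translate directly into code. In the minimal sketch below, the function names are hypothetical and the numeric values in the usage example are placeholders chosen only for illustration; the text does not give values for $P_{elec}$ or $\epsilon_{fs}$.

```python
def tx_power(distance_m: float, p_elec: float, eps_fs: float) -> float:
    """Transmit power of Eq. (3): P_elec + eps_fs * d^2."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return p_elec + eps_fs * distance_m ** 2


def rx_power(p_elec: float) -> float:
    """Receive power of Eq. (4): only the electronics term."""
    return p_elec


if __name__ == "__main__":
    # Placeholder constants (assumed, not from the text).
    P_ELEC = 50e-9
    EPS_FS = 10e-12
    for d in (10.0, 100.0, 300.0):
        print(d, tx_power(d, P_ELEC, EPS_FS), rx_power(P_ELEC))
```

Because the amplifier term grows with the square of the distance, node movement changes the transmit cost of every packet, which is what makes power control worthwhile.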

2.7  One-step  Markov  Path  Model 

The mobile nodes roam independently with variable ground speed. The mobility model is
called the one-step Markov path model [10]. In this model, the probability of moving in the same direction as the
previous move is higher than that of moving in other directions, which means the model has memory.
Fig. 1 shows the probabilities of the six directions.


Figure  1:  One-step  Markov  Path  Model 
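A minimal sketch of such a mobility step is given below. The six directions and the bias toward the previous direction come from the text. The value `p_same = 0.5` and the even split over the other five directions are assumptions, since the probabilities shown in Fig. 1 are not recoverable here.

```python
import random

NUM_DIRECTIONS = 6  # six directions, as in Fig. 1


def direction_probabilities(prev_dir: int, p_same: float = 0.5) -> list:
    """Probability of each direction given the previous one.

    p_same and the uniform split over the remaining directions are
    illustrative assumptions, not the values of the original figure.
    """
    p_other = (1.0 - p_same) / (NUM_DIRECTIONS - 1)
    return [p_same if d == prev_dir else p_other for d in range(NUM_DIRECTIONS)]


def next_direction(prev_dir: int, p_same: float = 0.5, rng=random) -> int:
    """Draw the next direction; the chain has one step of memory."""
    weights = direction_probabilities(prev_dir, p_same)
    return rng.choices(range(NUM_DIRECTIONS), weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    direction = 0
    path = []
    for _ in range(10):
        direction = next_direction(direction, rng=rng)
        path.append(direction)
    print(path)
```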

3 Overview of Interval Type-2 Fuzzy Logic Systems

Figure  2  shows  the  structure  of  a  type-2  FLS  [11].  It  is  very  similar  to  the  structure  of  a  type-1 
FLS  [12].  For  a  type-1  FLS,  the  output  processing  block  only  contains  the  defuzzifier.  We  assume 
that  the  reader  is  familiar  with  type-1  FLSs,  so  that  here  we  focus  only  on  the  similarities  and 
differences  between  the  two  FLSs. 

The fuzzifier maps the crisp input into a fuzzy set. This fuzzy set can, in general, be a type-2
set.

Figure 2: The structure of a type-2 FLS. In order to emphasize the importance of the type-reduced
set, we have shown two outputs for the type-2 FLS, the type-reduced set and the crisp defuzzified
value.

In the type-1 case, we generally have “IF-THEN” rules, where the $l$th rule has the form “$R^l$: IF
$x_1$ is $F_1^l$ and $x_2$ is $F_2^l$ and $\cdots$ and $x_p$ is $F_p^l$, THEN $y$ is $G^l$”, where: the $x_i$ are inputs; the $F_i^l$ are antecedent
sets ($i = 1, \ldots, p$); $y$ is the output; and the $G^l$ are consequent sets. The distinction between type-1
and type-2 is associated with the nature of the membership functions, which is not important while
forming rules; hence, the structure of the rules remains exactly the same in the type-2 case, the
only difference being that now some or all of the sets involved are of type-2; so, the $l$th rule in a
type-2 FLS has the form “$R^l$: IF $x_1$ is $\tilde{F}_1^l$ and $x_2$ is $\tilde{F}_2^l$ and $\cdots$ and $x_p$ is $\tilde{F}_p^l$, THEN $y$ is $\tilde{G}^l$”.

In  the  type-2  case,  the  inference  process  is  very  similar  to  that  in  type-1.  The  inference  engine 
combines  rules  and  gives  a  mapping  from  input  type-2  fuzzy  sets  to  output  type-2  fuzzy  sets.  To 
do  this,  one  needs  to  find  unions  and  intersections  of  type-2  sets,  as  well  as  compositions  of  type-2 
relations. 

In a type-1 FLS, the defuzzifier produces a crisp output from the fuzzy set that is the output
of the inference engine, i.e., a type-0 (crisp) output is obtained from a type-1 set. In the type-2
case, the output of the inference engine is a type-2 set; so, “extended versions” (using Zadeh’s
Extension Principle [13]) of type-1 defuzzification methods were developed in [11]. This extended
defuzzification gives a type-1 fuzzy set. Since this operation takes us from the type-2 output sets
of the FLS to a type-1 set, it was called “type-reduction”, and the set so
obtained was called a “type-reduced set” [11]. To obtain a crisp output from a type-2 FLS, we can
defuzzify the type-reduced set.

General type-2 FLSs are computationally intensive, because type-reduction is very intensive.
Things simplify a lot when the secondary membership functions (MFs) are interval sets (in this case,
the secondary memberships are either 0 or 1). When the secondary MFs are interval sets, the type-2
FLSs are called “interval type-2 FLSs”. In [14], Liang and Mendel proposed the theory and design
of interval type-2 fuzzy logic systems (FLSs). They proposed an efficient and simplified method
to compute the input and antecedent operations for interval type-2 FLSs, one that is based on a
general inference formula for them. They introduced the concept of upper and lower membership
functions (MFs) and illustrated their efficient inference method for the case of Gaussian primary
MFs. They also proposed a method for designing an interval type-2 FLS in which they tuned its
parameters.

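In an interval type-2 FLS with singleton fuzzification and meet under minimum or product
t-norm, the result of the input and antecedent operations, $F^l$, is an interval type-1 set, i.e.,
$F^l = [\underline{f}^l, \overline{f}^l]$, where $\underline{f}^l$ and $\overline{f}^l$ simplify to

$$\underline{f}^l = \underline{\mu}_{\tilde{F}_1^l}(x_1) * \cdots * \underline{\mu}_{\tilde{F}_p^l}(x_p) \qquad (5)$$

and

$$\overline{f}^l = \overline{\mu}_{\tilde{F}_1^l}(x_1) * \cdots * \overline{\mu}_{\tilde{F}_p^l}(x_p) \qquad (6)$$

where $x_i$ ($i = 1, \ldots, p$) denotes the location of the singleton.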
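The firing interval of a single rule can be sketched as follows, assuming Gaussian primary MFs with a fixed mean and a standard deviation that is uncertain within $[\sigma_1, \sigma_2]$ (the MF type used later in this paper) and the product t-norm. For such an MF the lower membership function is the Gaussian with the smaller standard deviation and the upper one is the Gaussian with the larger standard deviation. The function names are hypothetical.

```python
import math


def gaussian(x: float, mean: float, std: float) -> float:
    """Unnormalized Gaussian membership grade exp(-0.5 * ((x - m) / s)^2)."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2)


def interval_membership(x: float, mean: float, std_lo: float, std_hi: float):
    """(lower, upper) grades for a Gaussian primary MF with std in [std_lo, std_hi]."""
    return gaussian(x, mean, std_lo), gaussian(x, mean, std_hi)


def firing_interval(inputs, antecedents):
    """Lower and upper firing strengths of one rule, product t-norm.

    inputs      : crisp inputs x_1..x_p (singleton fuzzification)
    antecedents : one (mean, std_lo, std_hi) triple per input
    """
    lower, upper = 1.0, 1.0
    for x, (mean, std_lo, std_hi) in zip(inputs, antecedents):
        lo, hi = interval_membership(x, mean, std_lo, std_hi)
        lower *= lo
        upper *= hi
    return lower, upper


if __name__ == "__main__":
    # Two antecedents with made-up parameters, for illustration only.
    print(firing_interval([0.3, 0.7], [(0.0, 0.2, 0.4), (1.0, 0.3, 0.5)]))
```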

In this paper, we use center-of-sets type-reduction, which can be expressed as:

$$Y_{cos}(Y^1, \ldots, Y^M, F^1, \ldots, F^M) = [y_l, y_r] = \int_{y^1} \cdots \int_{y^M} \int_{f^1} \cdots \int_{f^M} 1 \Big/ \frac{\sum_{i=1}^{M} f^i y^i}{\sum_{i=1}^{M} f^i} \qquad (7)$$

where $Y_{cos}$ is an interval set determined by two end points, $y_l$ and $y_r$; $f^i \in F^i = [\underline{f}^i, \overline{f}^i]$;
$y^i \in Y^i = [y_l^i, y_r^i]$, and $Y^i$ is the centroid of the type-2 interval consequent set $\tilde{G}^i$; and $i = 1, \ldots, M$.
Because $Y_{cos}$ is an interval set, we defuzzify it using the average of $y_l$ and $y_r$; hence, the defuzzified
output of an interval type-2 FLS is

$$f(x) = \frac{y_l + y_r}{2} \qquad (8)$$
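The end points $y_l$ and $y_r$ can be found without evaluating the integrals, because each extreme value is attained when every firing strength sits at one end of its interval, with a single switch point in the sorted centroid list. The sketch below simply tries every switch point, which is easy to verify for a small rule base. It is an illustration, not the iterative procedure of [11] or [14], and it assumes that all lower firing strengths are strictly positive.

```python
def _weighted_average(weights, values):
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def center_of_sets(firing, centroids):
    """Return (y_l, y_r) of the center-of-sets type-reduced set.

    firing    : list of (f_lower, f_upper) per rule, f_lower > 0 assumed
    centroids : list of (y_left, y_right) per rule consequent
    """
    m = len(firing)

    # Left end point: rules with the smallest centroids get the upper firing strength.
    left = sorted(zip((c[0] for c in centroids), firing))
    y_l = min(
        _weighted_average(
            [f[1] if i < k else f[0] for i, (_, f) in enumerate(left)],
            [y for y, _ in left],
        )
        for k in range(m + 1)
    )

    # Right end point: rules with the largest centroids get the upper firing strength.
    right = sorted(zip((c[1] for c in centroids), firing))
    y_r = max(
        _weighted_average(
            [f[0] if i < k else f[1] for i, (_, f) in enumerate(right)],
            [y for y, _ in right],
        )
        for k in range(m + 1)
    )
    return y_l, y_r


def defuzzify(y_l: float, y_r: float) -> float:
    """Crisp output of Eq. (8)."""
    return 0.5 * (y_l + y_r)


if __name__ == "__main__":
    y_l, y_r = center_of_sets([(0.2, 0.8), (0.2, 0.8)], [(0.0, 0.0), (1.0, 1.0)])
    print(y_l, y_r, defuzzify(y_l, y_r))
```

When the lower and upper firing strengths coincide, the interval collapses to a single point and the result equals the ordinary type-1 weighted average.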
4 Modeling BER and MAC Layer Service Time with Gaussian
Membership Function

4.1  Analyzing  and  Modelling  BER 

Let $p$ be the probability that a bit is in error at any given time, so $p$ can be described as a random
variable with a known mean value $E_a$.

At any given time, a bit is in error with probability $p$ and correct with probability
$1 - p$. Since each bit is either in error or correct, the number of bits in error ($E_b$) in a fixed-length
transmission is a binomial random variable. If the length of the transmission is $N_t$ bits, the
probability that $E_b$ takes any value $x$ is:

$$P\{E_b = x\} = C_{N_t}^{x} \, p^x (1 - p)^{N_t - x} \qquad (9)$$

As the length of the transmission increases, the binomial distribution approaches
a normal distribution with mean $\mu = p N_t$ and variance $\sigma^2 = p(1 - p) N_t$.
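The quality of this approximation can be checked numerically. The sketch below compares the binomial probability of Eq. (9) with the normal density that has the same mean and variance. The values $N_t = 1000$ and $p = 0.1$ in the usage example are arbitrary illustrative choices, not values from the simulation.

```python
import math


def binomial_pmf(x: int, n_t: int, p: float) -> float:
    """P{E_b = x} from Eq. (9)."""
    return math.comb(n_t, x) * p ** x * (1.0 - p) ** (n_t - x)


def normal_approximation(x: float, n_t: int, p: float) -> float:
    """Normal density with mean p*N_t and variance p*(1-p)*N_t, evaluated at x."""
    mu = p * n_t
    var = p * (1.0 - p) * n_t
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


if __name__ == "__main__":
    n_t, p = 1000, 0.1  # illustrative values only
    for x in (80, 100, 120):
        print(x, binomial_pmf(x, n_t, p), normal_approximation(x, n_t, p))
```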

In this paper, we set up fine membership functions (MFs) for BER. From the original data of
BER shown in Table 1, we decomposed the whole data set into ten segments and computed the
mean $m_i$ and std $\sigma_i$ of the BER of the $i$th segment, $i = 1, 2, \ldots, 10$. We also computed the mean
$m$ and std $\sigma$ of the entire BER. To see which value, $m_i$ or $\sigma_i$, varies more, we normalized the mean
and std of each segment using $m_i/m$ and $\sigma_i/\sigma$, and we then computed the std of their normalized
values, $\sigma_m$ and $\sigma_{std}$.

As we see from the last row of Table 1, $\sigma_m \ll \sigma_{std}$. We conclude, therefore, that the BER of
each segment (short range) is Gaussian with uncertain standard deviation. One example of a type-2
Gaussian MF with uncertain standard deviation is shown in Fig. 3.
Table 1: Mean and std values for ten segments and the entire BER, and their normalized std.

BER               mean        std
Segment 1         0.016613    0.033315
Segment 2         0.015618    0.027857
Segment 3         0.015528    0.017401
Segment 4         0.016206    0.02107
Segment 5         0.015721    0.017148
Segment 6         0.016298    0.029309
Segment 7         0.017062    0.037428
Segment 8         0.016253    0.022871
Segment 9         0.016448    0.023194
Segment 10        0.016237    0.020675
Entire Traffic    0.016198    0.025829
Normalized std    0.029161    0.26184
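The last row of Table 1 can be reproduced from the segment values. The sketch below normalizes each segment mean and std by the corresponding entire-traffic value and takes the sample standard deviation of the result. The use of the sample (n - 1) form is an assumption. The variable and function names are hypothetical.

```python
import statistics

# Segment values copied from Table 1.
SEGMENT_MEAN = [0.016613, 0.015618, 0.015528, 0.016206, 0.015721,
                0.016298, 0.017062, 0.016253, 0.016448, 0.016237]
SEGMENT_STD = [0.033315, 0.027857, 0.017401, 0.02107, 0.017148,
               0.029309, 0.037428, 0.022871, 0.023194, 0.020675]
ENTIRE_MEAN = 0.016198
ENTIRE_STD = 0.025829


def normalized_std(segment_values, entire_value):
    """Sample std of the segment values after dividing by the entire-data value."""
    return statistics.stdev([v / entire_value for v in segment_values])


sigma_m = normalized_std(SEGMENT_MEAN, ENTIRE_MEAN)
sigma_std = normalized_std(SEGMENT_STD, ENTIRE_STD)

if __name__ == "__main__":
    print(sigma_m, sigma_std)
```

The std of the normalized segment stds is roughly nine times that of the normalized segment means, which is the basis for treating the mean as fixed and the standard deviation as uncertain.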

4.2  Analyzing  and  Modelling  MAC  Layer  Service  Time 

Recent research by Zhai, Kwon, and Fang [7] found that the lognormal distribution can match
the MAC layer service time, i.e., if the MAC layer service time for packet $i$ is $s_i$, then

$$\log_{10} s_i \sim \mathcal{N}(\cdot\,; m, \sigma^2) \qquad (10)$$

We therefore modeled the logarithm of the MAC layer service time, to see whether a Gaussian
MF can match its nature. We decomposed the whole data set into ten segments and computed
the mean $m_i$ and std $\sigma_i$ of the logarithm of the MAC layer service time of the $i$th segment,
$i = 1, 2, \ldots, 10$. We also computed the mean $m$ and std $\sigma$ of the entire logarithm of the MAC layer
service time. To see which value, $m_i$ or $\sigma_i$, varies more, we normalized the mean and std of each
segment using $m_i/m$ and $\sigma_i/\sigma$, and we then computed the std of their normalized values, $\sigma_m$ and
$\sigma_{std}$.

Figure 3: Type-2 Gaussian MF with uncertain standard deviation

As we see from the last row of Table 2, $\sigma_m \ll \sigma_{std}$. We conclude, therefore, that the
logarithm of the MAC layer service time of each segment (short range) is Gaussian with uncertain
standard deviation, as shown in Fig. 3.

5  Cross-layer  Design  Using  Interval  Type-2  Fuzzy  Logic  System 

As introduced in the preliminaries, a high BER means a high packet loss rate, and requests for
resends increase latency; delay-sensitive traffic therefore requires a very low BER. The MAC
layer service time is important when we examine the performance of higher protocol layers. Thus,
the BER and the MAC layer service time both affect the packet transmission delay between
the mobile nodes. We are now ready to evaluate the packet transmission delay using interval type-2
fuzzy logic systems.

We  predict  packet  transmission  delay  based  on  the  following  two  antecedents: 

1.  Antecedent  1.  BER. 

2.  Antecedent  2.  MAC  layer  service  time. 

The consequent is the packet transmission delay. The linguistic variables used to
represent the BER and MAC layer service time were divided into three levels: low, moderate, and
high. The consequent, the packet transmission delay, was divided into five levels: very low, low,
moderate, high, and very high.

Table 2: Mean and std values for ten segments and the entire logarithm of MAC layer service time,
and their normalized std.

MAC layer service time    mean         std
Segment 1                 -1.1902      0.44295
Segment 2                 -1.1929      0.44698
Segment 3                 -1.1967      0.45237
Segment 4                 -1.1959      0.44835
Segment 5                 -1.1917      0.43598
Segment 6                 -1.1924      0.44779
Segment 7                 -1.1976      0.45687
Segment 8                 -1.1996      0.45554
Segment 9                 -1.1923      0.45068
Segment 10                -1.1997      0.462
Entire Traffic            -1.1949      0.44981
Normalized std            0.0028746    0.016421

We designed questions such as:

IF BER is low and MAC layer service time is high, THEN the packet transmission delay is ____.

So we need to set up $3^2 = 9$ rules for this FLS (because every antecedent has 3 fuzzy subsets, and there are 2
antecedents). We summarized these rules in Table 3.

We used Gaussian membership functions (MFs) to represent the antecedents and the
consequent.

Fig. 4 shows the FLS application for the cross-layer design.


Table 3: Fuzzy Rules and Consequent

Antecedent 1 (BER)    Antecedent 2 (MAC layer service time)    Consequent
Low                   Low                                      Very Low
Low                   Moderate                                 Low
Low                   High                                     Moderate
Moderate              Low                                      Low
Moderate              Moderate                                 Moderate
Moderate              High                                     High
High                  Low                                      Moderate
High                  Moderate                                 High
High                  High                                     Very High
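The rule base can be held in a small lookup table keyed by the two antecedent levels. The sketch below reads the nine flattened triples of Table 3 as (BER, MAC layer service time, consequent). Because the resulting table is symmetric in the two antecedents, the column order does not affect the outcome. The names `RULES` and `consequent` are hypothetical.

```python
LEVELS = ("low", "moderate", "high")

# (BER level, MAC layer service time level) -> packet transmission delay level
RULES = {
    ("low", "low"): "very low",
    ("low", "moderate"): "low",
    ("low", "high"): "moderate",
    ("moderate", "low"): "low",
    ("moderate", "moderate"): "moderate",
    ("moderate", "high"): "high",
    ("high", "low"): "moderate",
    ("high", "moderate"): "high",
    ("high", "high"): "very high",
}


def consequent(ber_level: str, service_time_level: str) -> str:
    """Linguistic delay level for one combination of antecedent levels."""
    return RULES[(ber_level, service_time_level)]


if __name__ == "__main__":
    for ber in LEVELS:
        for service_time in LEVELS:
            print(ber, service_time, "->", consequent(ber, service_time))
```

In a full FLS each of these linguistic levels would be represented by a Gaussian MF, and the lookup would be replaced by the firing-strength computation and type-reduction described in Section 3.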

When  a  mobile  node  sends  out  a  packet,  it  will  first  predict  the  packet  transmission  delay  using 
the  FLS  algorithm.  After  that,  the  node  could  adjust  the  transmission  power  according  to  the 
predicted  packet  transmission  delay.  Therefore  average  delay,  energy  consumption  and  throughput 
performances  will  change. 

6  Simulations 

We implemented the simulation model using the OPNET Modeler. The simulation region is
300 x 300 meters. There were 12 mobile nodes in the simulation model, and the nodes roamed
independently with variable ground speed between 0 and 10 meters per second. The mobility model
was the one-step Markov path model. The movement changed the distance between mobile
nodes. We assumed that the data collected by each mobile node arrived with exponentially distributed
inter-arrival times with a mean of 0.2 second, and that the packet length was 512 bits.

Figure 4: FLS application for cross-layer design

For the type-1 FLS, we chose Gaussian membership functions for the antecedents; for the interval type-2
FLS, we used Gaussian primary MFs with fixed mean and uncertain std for the antecedents. The
steepest descent algorithm was used to train all the parameters based on the first 300 data sets. After
training, the rules were fixed, and we tested the FLS on the remaining 300 data sets.

In Fig. 5, we summarize the root-mean-square error (RMSE) between the estimated packet
transmission delay and the actual delay:

$$RMSE = \sqrt{\frac{1}{300} \sum_{i=301}^{600} [d(i) - f(i)]^2} \qquad (11)$$

where $d(i)$ is the actual packet transmission delay and $f(i)$ is the estimated delay.
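The RMSE over the test samples is straightforward to compute. The sketch below takes the actual and estimated delays of the test set as two equal-length sequences. The function name and the sample values in the usage example are illustrative only.

```python
import math


def rmse(actual, estimated) -> float:
    """Root-mean-square error between actual delays d(i) and estimates f(i)."""
    if len(actual) != len(estimated) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((d - f) ** 2 for d, f in zip(actual, estimated))
    return math.sqrt(squared_error / len(actual))


if __name__ == "__main__":
    # Made-up delays in seconds, for illustration only.
    d = [0.10, 0.12, 0.09, 0.15]
    f = [0.11, 0.10, 0.09, 0.13]
    print(rmse(d, f))
```

With 600 data sets split as in the text, the call would be made on samples 301 to 600 only, so the divisor is 300 as in Eq. (11).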

The simulation result shows that the interval type-2 FLS for packet transmission delay analysis
and prediction outperforms the type-1 FLS.

In the following performance analysis, we also assume that the actual transmission delay is known.
We use this as an ideal case and take its performance parameters as the bounds.


6.0.1  Average  Latency 

Figure 5: The RMSE of packet transmission delay prediction for two FLS approaches

We used the average latency to evaluate the network performance. Each packet was
labeled with a timestamp when it was generated by the source sensor node. When its destination sensor
node received it, the elapsed time interval was the transmission delay.


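$$\text{Average Latency} = \frac{\text{sum of the transmission delays of all received packets}}{\text{number of received packets}} \qquad (12)$$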


Fig. 6 shows the latency performance of the three algorithms. The type-2 algorithm was better
than the type-1 algorithm: the type-2 predictor reduced the average delay by up to 20% compared with the
type-1 predictor. The ideal case gave the best performance among the three.


Figure  6:  Average  Delay  for  Three  Algorithms 


6.1  Energy  Efficiency 

Recharging the battery is not convenient, so energy efficiency is extremely important
for mobile ad hoc networks. We used the remaining energy
to describe the energy efficiency.

Figure 7: Remaining Energy for Three Algorithms

Fig. 7 shows the remaining energy for the three algorithms. We assumed that the initial energy of
each sensor is 2.0 J, and we adopted the CSMA/CA protocol to handle packet collisions. If
a sensor node transmitted $Num_s$ packets (each packet costing 1 second), received $Num_r$ packets
(each packet also costing 1 second), and roamed in the network for a time $T_m$, the
remaining energy $E_i$ of this sensor node is:

$$E_i = 2.0 - (3 \times 10^{-5} \times T_m + 1.2 \times 10^{-3} \times Num_s \times 1 + 6 \times 10^{-4} \times Num_r \times 1) \qquad (13)$$

As with the average delay, the type-2 algorithm was better than the type-1 algorithm
in terms of energy consumption. The type-2 predictor reduced the energy consumption
by up to 21% compared with the type-1 predictor. The ideal case was set as the lower bound.
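The remaining-energy calculation of Eq. (13) can be sketched as below. The three coefficients and the 2.0 J initial energy are taken from the equation. Multiplying the transmit and receive terms by the packet counts $Num_s$ and $Num_r$ is an assumption based on the surrounding sentence, since the printed equation is partly garbled. The function and parameter names are hypothetical.

```python
ROAMING_POWER_W = 3e-5   # coefficient of T_m in Eq. (13)
TX_POWER_W = 1.2e-3      # coefficient of the transmit term in Eq. (13)
RX_POWER_W = 6e-4        # coefficient of the receive term in Eq. (13)
PACKET_TIME_S = 1.0      # each packet costs 1 second, per the text


def remaining_energy(t_roam_s: float, num_sent: int, num_received: int,
                     initial_energy_j: float = 2.0) -> float:
    """Remaining energy in joules after roaming, sending, and receiving."""
    consumed = (ROAMING_POWER_W * t_roam_s
                + TX_POWER_W * num_sent * PACKET_TIME_S
                + RX_POWER_W * num_received * PACKET_TIME_S)
    return initial_energy_j - consumed


if __name__ == "__main__":
    print(remaining_energy(t_roam_s=100.0, num_sent=10, num_received=10))
```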

6.2  Networks  Efficiency 

The mobile ad hoc networks were used to collect data and transfer packets. The throughput of
transmitted packets was one of the parameters used to evaluate the network efficiency. In our simulation,
we assumed that the data collected by each mobile node followed a Poisson distribution with an
arrival interval of 0.2 second.

As seen from Fig. 8, the type-2 algorithm was better than the type-1 algorithm. The type-2
predictor increased the throughput by up to 45% compared with the type-1 predictor. The ideal case
was set as the upper bound.
Figure 8: Throughput for Three Algorithms

We introduced the fuzzy logic system into the cross-layer design. Compared with other algorithms
for cross-layer design, the fuzzy method is flexible and simpler to implement. We can
predict the packet transmission delay using information from only the physical layer and the
MAC layer, which is a potential advantage in applications. We use the FLSs as predictors and
control the transmission power according to the outputs of the predictors. Simulation results
show that the type-2 algorithm is better than the type-1 algorithm. The ideal case
serves as the performance bound.

7  Conclusion 

Cross-layer design is an effective method to improve the performance of mobile ad hoc networks.
We apply the fuzzy logic system to combine the physical layer and the data-link layer. We select
the BER and the MAC layer service time as antecedents to analyze and predict the packet transmission
delay, and we apply a type-1 FLS and an interval type-2 FLS for the packet transmission delay
analysis and prediction. Simulation results show that the interval type-2 FLS for packet
transmission delay analysis and prediction outperforms the type-1 FLS. We use the FLSs as predictors
and control the transmission power according to the outputs of the predictors. Simulation
results show that the type-2 algorithm is better than the type-1 algorithm. The
ideal case serves as the performance bound.
Abstract - Ultra wideband (UWB) technology offers unique
advantages for wireless communications: precise location-timing
capabilities, low power, low complexity, and low cost. No
existing wireless network successfully takes advantage of these
properties because of the lack of an efficient
medium access control (MAC) technology. In this paper, we
propose an energy-efficient MAC protocol: the asynchronous MAC
protocol for UWB communications (A-MAC-UWB). Based on
the characteristics of UWB communication, we utilize virtual
MIMO technology to increase the data rate, and substitute space
diversity for time diversity to improve system performance in
terms of energy efficiency and bit error rate (BER). Also, we
implement multiuser access through an ALOHA scheme, instead of
a mutual exclusion method such as TDMA and random access. For
multiuser interference, we set up a model to adaptively adjust the
data rate to ensure a certain signal-to-noise ratio (SNR) at the receiver
side, since the Shannon capacity of a multipath fading additive white
Gaussian noise (AWGN) wideband channel is a linear function
of SNR. For optimum design of the power on/off phase duration,
we consider traffic whose arrival interval follows a heavy-tailed
distribution, instead of a Poisson distribution. Based on that, we
derive the probability density function (pdf) of the power-off phase
duration for our algorithm. Compared with our previous work,
we try to find a better trade-off between data packet
latency and energy conservation.

I.  Introduction 

A wireless sensor network (WSN) can be thought of as an
ad hoc network consisting of sensor nodes linked by a
wireless medium to perform distributed sensing tasks. Recent
developments in integrated circuit technology have allowed the
construction of low-cost small sensor nodes with signal
processing and wireless communication capabilities. Distributed
WSNs have a growing range of applications because they hold
the potential to revolutionize many segments of our economy
and life, from environmental monitoring and conservation, to
manufacturing and business asset management, to automation
in the transportation and health-care industries [1].

According to the Federal Communications Commission (FCC),
an ultra-wideband (UWB) system is defined as any radio
system that has a 10-dB bandwidth larger than 20 percent
of its center frequency, or has a 10-dB bandwidth equal to
or larger than 500 MHz [2]. To enable the deployment of
UWB systems, the FCC allocated an unlicensed frequency band,
3.1-10.6 GHz, for indoor or hand-held UWB communication
systems [2]. UWB is an attractive technology for WSNs due
to its high data rate, low radiated power, and accurate ranging
capability. Two different UWB communications systems -
impulse-based systems and multi-carrier systems - have been
pursued recently. For low cost and low power applications,
impulse UWB (I-UWB) has several advantages over
multi-carrier systems, including robustness to Rayleigh fading and
simple, low power hardware.
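The two-part FCC criterion quoted above can be written as a short check. In this sketch the function name is hypothetical, and taking the center frequency as the midpoint of the 10-dB band edges is an assumption of the example.

```python
def is_uwb(f_low_hz: float, f_high_hz: float) -> bool:
    """True if the 10-dB band [f_low_hz, f_high_hz] meets the criterion in the text.

    A system qualifies if its bandwidth exceeds 20 percent of the center
    frequency, or if its bandwidth is at least 500 MHz.
    """
    if f_high_hz <= f_low_hz:
        raise ValueError("f_high_hz must exceed f_low_hz")
    bandwidth = f_high_hz - f_low_hz
    center = 0.5 * (f_high_hz + f_low_hz)  # midpoint, assumed
    return bandwidth >= 500e6 or bandwidth / center > 0.20


if __name__ == "__main__":
    print(is_uwb(3.1e9, 10.6e9))    # the full unlicensed band
    print(is_uwb(2.40e9, 2.42e9))   # a 20 MHz narrowband channel
```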

Multiantenna systems have been studied intensively in
recent years due to their potential to dramatically increase
the channel capacity in fading channels [3]. It has been
shown [3] that multi-input-multi-output (MIMO) systems can
support higher data rates under the same transmit power budget
and bit-error-rate performance requirements as a single-input
single-output (SISO) system. An alternative view is that for
the same throughput requirement, MIMO systems require less
transmission energy than SISO systems. However, direct
application of multiantenna techniques to sensor nodes is impractical
due to the limited physical size of a sensor node, which
typically can only support a single antenna. In recent years,
the virtual MIMO concept has been proposed by Cui [4] and
Jayaweera [5], which allows individual single-antenna nodes
to cooperate on information transmission and/or reception.
A cooperative MIMO system can be constructed such that
energy-efficient MIMO schemes can be deployed.

In this paper, we propose an energy-efficient MAC protocol:
the asynchronous MAC protocol for UWB communications
(A-MAC-UWB). Based on the characteristics of UWB
communication, we utilize virtual MIMO technology to increase the
data rate, and substitute space diversity for time diversity to
improve system performance. The structures of the corresponding
transmitter and receiver are given in this paper. Also, we
implement multiuser access through an ALOHA scheme, instead
of a mutual exclusion method such as TDMA and random
access. For multiuser interference, we set up a model to adaptively
adjust the data rate to ensure a certain SNR at the receiver side,
since the Shannon capacity of a multipath fading additive white
Gaussian noise (AWGN) wideband channel is a linear function
of SNR. For optimum design of the power on/off phase duration,
we consider traffic whose arrival interval follows a
heavy-tailed distribution, instead of a Poisson distribution. Based on
that, we derive the probability density function (pdf) of the
power-off phase duration for our algorithm. Compared with
our previous work, we try to find a better trade-off
between data packet latency and energy conservation.

The remainder of this paper is organized as follows.
Section II reviews related work. In Section III we formulate
the problems covered in this paper. Assumptions and modelling
related to our algorithm are given in Section IV. Section V
describes our A-MAC-UWB algorithm.

II.  Related  Work 

In contrast to typical WLAN protocols, MAC protocols
designed for WSNs usually trade off performance (latency,
throughput, fairness) for cost (energy efficiency, reduced
algorithmic complexity). The main idea of energy-efficient
MAC protocols for narrowband systems is that sensor nodes
intelligently power off radios that are not actively
transmitting or receiving packets. The goal is to carry out
information exchange while reducing energy consumption to
extend the lifetime of networks. Narrowband energy-efficient
MAC protocols for WSNs can be classified into three main
categories according to the strategies applied for channel
access:

•  Contention-Based Protocols: the 802.11 [6] standard is
based on carrier sensing (CSMA) and collision detection
(through acknowledgements).

•  TDMA-Based Protocols: traffic-adaptive medium access
(TRAMA) [7] employs a traffic-adaptive and distributed
election scheme to allocate the system time among
sensor nodes. Other TDMA-based energy-efficient MAC
protocols such as EMACS, bit-map-assisted (BMA) and
GANGS MAC protocols are described in [8], [9], [10].

•  Slotted Protocols: S-MAC [11] is a low power
RTS-CTS protocol for WSNs inspired by PAMAS [12]
and 802.11. T-MAC [13] improves on S-MAC’s energy
usage by using a very short listening window at the
beginning of each active period. B-MAC [14] provides
a flexible interface to obtain ultra low power operation,
effective collision avoidance, and high channel utilization.

Multiple access communications employing pulsed UWB
technologies have drawn significant research interest. The MAC
should be specifically conceived for the UWB radio physical
layer, and as such foresee and eventually optimize strategies
for power sharing and management. Various multiple access
schemes and their performance have been reported in the
literature [15] [16]. Time hopping (TH) has been found to be a
good multiple access technique for pulsed UWB systems [16].
Direct sequence (DS) spreading is also an attractive method
for multiple access in UWB systems. F. Cuomo et al. [17]
outlined key issues in designing a multiple access scheme based
on UWB. They selected a distributed mechanism to handle radio
resource sharing, and presented a general framework of radio
resource sharing for UWB wireless ad hoc network systems.
J. Ding et al. [18] studied the impact of the channel
acquisition time with different MAC protocols: a centralized
TDMA and a distributed CSMA/CA.

Information-theoretic results in [19] and [20] show that
the Shannon capacity of a multipath fading AWGN wideband
channel is a linear function of the signal-to-noise ratio (SNR).
That is:

$$R = K \times \mathrm{SNR} \tag{1}$$

Thus, for a given desired bit-error rate on the link, an
efficient wideband physical layer implementation should have a
linear rate function within the operational interval of SNRs.
Moreover, UWB is flexible in the reconfiguration process of
data rate and power, due to the availability of a number of
transmission parameters which can be tuned to better match
the requirements of a data flow. Therefore, UWB systems can
support multiple access much better than narrowband systems.

III.  Problem  Formulation 

The biggest challenge for designers of WSNs is to develop
systems that will run unattended for years. This calls for not
only robust hardware and software, but also lasting energy
resources. However, the current generation of sensor nodes is
battery powered with limited available energy, and replacing
or recharging batteries may, in many cases, be impractical
or uneconomical. Lifetime is therefore a major constraint. Even
though future generations can be powered by ambient energy
sources (sunlight, vibrations, etc.) [21], the current provided
is very low, so energy consumption is heavily constrained. From
both perspectives, protocols and applications designed for WSNs
should be highly efficient and optimized in terms of energy.

Energy-efficient communication techniques typically focus
on minimizing the transmission energy only, which is reasonable
in long-range applications where the transmission energy
is dominant in the total energy consumption. However, in
short-range applications such as WSNs, the circuit energy
consumption is comparable to or even dominates the
transmission energy [4]. Cui [4] claims that the traditional
belief that MIMO systems are more energy-efficient than SISO
systems in Rayleigh-fading channels is misleading when both
the transmission energy and the circuit energy consumption
are considered in short-range applications, unless the
constellation size is optimized.

In UWB WSNs that use multiple access instead of mutual
exclusion access methods, one main source of energy waste is
idling while waiting for the next data packet to arrive. We
call this inter-packet idle listening; it is caused by bursty
traffic. Besides this, we found that there is another kind of
idle listening in UWB systems, which we call inter-symbol idle
listening. As shown in Section IV-B, each bit is repeated Ns
times; during each frame duration (Tf) only one pulse is sent
and the remaining time is spent idle waiting for the next
pulse, and there are Nh bins in one frame time. In this case,
for one information bit, the ratio of actual
transmission/reception time to one bit duration is Tc/Tf, and
the ratio of inter-symbol idle time to one bit duration is
(Nh - 1)Tc/Tf. Typical values for these key parameters are
Tf = 100 ns, Tc = 0.75 ns, Ns = 100, and Nh = 100. Note that
about 74.25% of the time is idle for one information bit
transmission.
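
The 74.25% figure can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and argument names are ours, and it simply evaluates the two ratios Tc/Tf and (Nh - 1)Tc/Tf for the typical values quoted in the text.

```python
def idle_listening_ratios(t_frame_ns, t_chip_ns, n_bins):
    """Return (active_ratio, idle_ratio) for one information bit.

    active_ratio = Tc / Tf             (the single pulse-bearing bin of a frame)
    idle_ratio   = (Nh - 1) * Tc / Tf  (the other Nh - 1 bins of the frame)
    Ns does not appear because each of the Ns frames has the same layout.
    """
    active = t_chip_ns / t_frame_ns
    idle = (n_bins - 1) * t_chip_ns / t_frame_ns
    return active, idle


# Typical values from the text: Tf = 100 ns, Tc = 0.75 ns, Nh = 100.
active, idle = idle_listening_ratios(t_frame_ns=100.0, t_chip_ns=0.75, n_bins=100)
print(f"active: {active:.4%}, inter-symbol idle: {idle:.2%}")
```
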

Furthermore, compared with narrowband communication
systems, UWB has low power consumption. The research in
[10] points out that a bit rate of 100 kbps over 5 meters can
be achieved with no more than 1 mW power consumption.
Therefore, saving energy in the circuit part is more effective
for UWB systems than for narrowband systems. Hence, how
to extend the superiority of virtual MIMO systems in terms
of energy efficiency down to very short distances for UWB
systems is one of the objectives of this paper.

IV.  Assumption  and  Modelling 

A.  Network  Model 

A commonly encountered distributed wireless sensor
network model consists of a lead sensor and a set of data
collection nodes. In this model, a number of low-end data
collection sensors are connected over a wireless link with a
high-end data gathering node, which may act as a lead sensor
or a fusion center. In such networks, the data collection
sensors are typically subject to strict energy constraints
while the data gathering node is not. The data collection nodes
collect data on a physical phenomenon of interest and
communicate them over a wireless link to the data gathering
node, which performs the required joint processing. Suppose a
set of Nt data collection nodes (possibly close to each other)
has data to be sent to the data gathering node. All these
sensors transmit their data simultaneously to the data
gathering node as in a conventional VBLAST system [22]. Which
sets of nodes can transmit simultaneously will be designed in
the next section. We assume that there are Nr - 1 local sensors
surrounding the data gathering node which are willing to assist
it in realizing a virtual receiver antenna array of size Nr
(including the data gathering node itself). Each of these Nr
sensor nodes receives the transmissions. The Nr - 1 assisting
nodes quantize their received signal samples and re-transmit
these bits to the data gathering node.

B.  Physical  Layer  Model 

The UWB physical model of the network on which the
design of our protocol is based is discussed in this section.
The most common and traditional way of emitting a UWB
signal is by radiating pulses that are very short in time.
Impulse radio (IR) transmits extremely short pulses, giving
rise to wide spectral occupation in the frequency domain
(bandwidth from near dc to a few gigahertz). The way in which
the information data symbols modulate the pulses may vary.
Pulse Position Modulation (PPM) and Pulse Amplitude Modulation
(PAM) are commonly adopted modulation schemes [24] [25]. In
addition to modulation, and in order to shape the spectrum
of the generated signal, the data symbols are encoded using
pseudorandom or pseudonoise (PN) codes. Fig. 1 reports an
example of transmission by two users, each characterized by
a TH code word.

In Fig. 1, Tf is the frame duration and Tc is the bin duration.

The output x^(n)(t) (n = 1, 2, ..., Nt) of the n-th user’s
transmitter can be expressed as follows:

$$x^{(n)}(t) = \sqrt{E_{tx}^{(n)}}\, p\left(t - c^{(n)} T_c - a^{(n)} \epsilon\right) \tag{2}$$

Fig.  1.  UWB  physical  layer  with  PPM,  the  model  of  Win-Scholtz  [16] 


where $E_{tx}^{(n)}$ is the transmitted energy per pulse for the
n-th user at the transmitter antenna. Note that the bit
interval, or bit duration, that is, the time used to transmit
one bit, Tb, is Tb = Ts. Compared with generic TH-PPM UWB signal
transmitters, such as the one used in the Win-Scholtz physical
model [16], in which Tb = Ns Ts if Ns-fold redundancy is also
introduced, our MIMO TH-PPM UWB transmitter improves
the data rate Ns times, which is one of the advantages
introduced by utilizing MIMO technology.

The signal received by the i-th (i = 1, 2, ..., Nr) user’s
antenna at time t is $z^{(i)}(t)$, written as follows:

$$z^{(i)}(t) = \sum_{j=1}^{N_t} \alpha^{(i,j)}(t)\, x^{(j)}\left(t - \tau^{(i,j)}\right) + w^{(i)}(t) + n^{(i)}(t) \tag{3}$$

where $\alpha^{(i,j)}(t)$ is the channel coefficient for the j-th
transmitter at the i-th receiver, $\tau^{(i,j)}$ is the delay of the
j-th transmitter at the i-th receiver, $w^{(i)}(t)$ is the
multiuser interference, and $n^{(i)}(t)$ is the AWGN.

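Combining (2) and (3), we derive that:

$$z^{(i)}(t) = \sum_{j=1}^{N_t} \sqrt{E_{tx}^{(j)}}\, \alpha^{(i,j)}(t)\, p\left(t - c^{(j)} T_c - a^{(j)} \epsilon - \tau^{(i,j)}\right) + w^{(i)}(t) + n^{(i)}(t) \tag{4}$$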

V.  Proposed  Asynchronous  MAC  Protocol  for 
UWB  Communications  (A-MAC-UWB)  Design 

A-MAC-UWB divides system time into four phases: TRFR-Phase,
Schedule-Phase, On-Phase and Off-Phase (Fig. 2).

•  TRFR-Phase is reserved for normal nodes to exchange
Traffic-Rate & Failure-Rate (TRFR) messages and data
packets;

•  Schedule-Phase is reserved for cluster heads to locally
broadcast phase-switching schedules;

•  Off-Phase is reserved for all normal nodes to power off
their radios. In this phase, there is no communication, but
data storing and sensing may happen;

•  On-Phase is reserved for all normal nodes to power on
their radios to carry on communication.

In A-MAC-UWB, according to information collected from
normal nodes, cluster heads estimate the influence of clock
drifts on communications and the capacity to buffer packets
within their region. Then cluster heads choose the power on/off
duration and the interval for schedule broadcast. Finally, nodes
set up their own phase-switching schedules to alternately power
on their radios for communication and power off their radios to
save energy.

Fig. 2. Time Scheme Structure for A-MAC-UWB

A.  Essential  Parameter  Design 

1) Off-Phase Duration (Tf): It is now recognized [26] [27]
that traffic in wired and wireless communication networks is
better described by heavy-tailed distributions than by Poisson,
Gaussian or other classical distributions with exponentially
decreasing tails. In this paper, we model network arrivals
with a Pareto distribution, a heavy-tailed distribution,
instead of the Poisson distribution used in [28]. The
probability density function is given in (5):

$$f(x) = \alpha k^{\alpha} x^{-\alpha-1}, \quad 1 < \alpha < 2,\ k > 0,\ x \ge k \tag{5}$$

and its cumulative distribution function is given by:

$$F(x) = 1 - \left(\frac{k}{x}\right)^{\alpha} \tag{6}$$

where k represents the smallest value the random variable can
take.
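
As an illustration of (5) and (6), the following minimal sketch evaluates the Pareto density and distribution function and draws inter-arrival times by inverting the distribution function. The function names and the parameter values in the usage lines are our own assumptions, not part of the protocol.

```python
import random


def pareto_pdf(x, alpha, k):
    """Density (5): alpha * k**alpha * x**(-alpha - 1) for x >= k, else 0."""
    if x < k:
        return 0.0
    return alpha * k**alpha * x ** (-alpha - 1)


def pareto_cdf(x, alpha, k):
    """Distribution function (6): 1 - (k / x)**alpha for x >= k, else 0."""
    if x < k:
        return 0.0
    return 1.0 - (k / x) ** alpha


def sample_interarrival(alpha, k, rng=random):
    """Draw one inter-arrival time by inverting (6): x = k * (1 - u)**(-1/alpha)."""
    u = rng.random()  # uniform on [0, 1), so 1 - u is never zero
    return k * (1.0 - u) ** (-1.0 / alpha)


# Hypothetical parameters: alpha = 1.5, k = 2.0 (arbitrary time units).
rng = random.Random(1)
samples = [sample_interarrival(1.5, 2.0, rng) for _ in range(5)]
print(samples)
```

Inverse-transform sampling is used here only because it follows directly from (6); any other Pareto generator would serve equally well.
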

In this case, we establish an embedded Markov chain to
express the packet arrival process for each user. N(t) is the
number of data packets in the buffer at time t. $N_n^-$ stands for
the queue length seen by the n-th arriving data packet (the
currently arriving data packet not included). Even though the
queue length of each user no longer has the Markov property,
$\{N_n^-, n \ge 0\}$ forms a Markov chain, an embedded Markov
chain. The average arrival interval ($1/\lambda$) is given by:

$$\frac{1}{\lambda} = \int_k^{\infty} x\, dF(x) = \frac{\alpha k}{\alpha - 1} \tag{7}$$

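During the Off-Phase, about ($T_{f,i} \times \lambda_i$) data packets
arrive at node i. We assume the buffer size for node i is
Bs. Then the duration, denoted by $t_i$, within which node i’s
buffer is completely filled with arrived data packets is given
by $t_i = B_s/\lambda_i$. Considering the first criterion, $T_{f,i}$ for
node i should not be longer than $t_i$. In this algorithm, we let

$$T_{f,i} = t_i = \frac{B_s}{\lambda_i} \tag{8}$$

By (7), $1/\lambda_i = \alpha_i k_i/(\alpha_i - 1)$, so
$T_{f,i} = \alpha_i k_i B_s/(\alpha_i - 1)$.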

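We assume Bs is a constant and $\alpha_i = \alpha$ for all users,
while $k_i$ follows a uniform distribution on the range [0, k*].
Note that the cumulative distribution function of $T_{f,i}$ is
given by:

$$F_{T_{f,i}}(t_{f,i}) = P\{T_{f,i} \le t_{f,i}\} = F_K\!\left(\frac{t_{f,i}(\alpha - 1)}{\alpha B_s}\right) \tag{9}$$

Since

$$F_K(k) = \frac{k}{k^*}, \quad 0 \le k \le k^* \tag{10}$$

then

$$F_{T_{f,i}}(t_{f,i}) = \frac{t_{f,i}(\alpha - 1)}{\alpha k^* B_s} \tag{11}$$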
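
To make (8) and (11) concrete, the sketch below computes a node's Off-Phase duration from its Pareto parameters and evaluates the resulting distribution function. It is a minimal illustration under the stated assumptions (constant Bs, common α, k_i uniform on [0, k*]); the function names, the clipping outside the support, and the example numbers are ours.

```python
def off_phase_duration(alpha, k_i, buffer_size):
    """T_{f,i} = B_s / lambda_i, with 1 / lambda_i = alpha * k_i / (alpha - 1) from (7)."""
    mean_interarrival = alpha * k_i / (alpha - 1.0)
    return buffer_size * mean_interarrival


def off_phase_cdf(t, alpha, k_star, buffer_size):
    """F_{T_{f,i}}(t) from (11), clipped to [0, 1] outside its support."""
    t_max = alpha * k_star * buffer_size / (alpha - 1.0)  # duration reached at k_i = k*
    if t <= 0.0:
        return 0.0
    return min(t / t_max, 1.0)


# Hypothetical values: alpha = 1.5, k* = 2.0, buffer of 10 packets.
print(off_phase_duration(alpha=1.5, k_i=2.0, buffer_size=10))      # longest possible T_{f,i}
print(off_phase_cdf(30.0, alpha=1.5, k_star=2.0, buffer_size=10))  # P{T_{f,i} <= 30}
```
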


Since, within a cluster, there are multiple nodes with
different traffic arrival rates, the durations for the nodes
will not all be equal. Suppose we let the Off-Phase duration
for the whole cluster, Tf,tot, equal that of the i-th user,
that is, Tf,tot = Tf,i. A new arrival to an idle system, rather
than going into service immediately, waits for the end of the
vacation period, and arrivals are served in
first-come-first-served order. Therefore, the longer Tf,tot is,
the longer data packets wait in the buffer for transmission. We
leverage the GI*/G/1 with vacation model to model our system.
Through analysis, we try to get the relationship between the
average waiting time (Wj) for the j-th user and Tf,tot, that
is, Wj = fj(Tf,tot).

Based on this relationship, the probability ($p_j$) that data
packets are out of date for the j-th user when the Off-Phase
duration equals Tf,tot is given in (12).

$$p_j = P\{W_j > W_{max}\} = 1 - F_{T_{f,tot}}\!\left(f_j^{-1}(W_{max})\right) \tag{12}$$

Since Tf,tot = Tf,i, (12) is rewritten as follows:

$$p_{ij} = P\{W_{ij} > W_{max}\} = 1 - F_{T_{f,i}}\!\left(f_j^{-1}(W_{max})\right) \tag{13}$$

where $p_{ij}$ is the out-of-date probability of data packets for
the j-th user when the i-th user’s Off-Phase duration is used
for the whole cluster.

For a system with an Off-Phase duration $T_{f,i}$ and total
out-of-date probability $\sum_j p_{ij}$, we represent its
objective function as

$$\arg\max J(T_{f,i}) = \arg\max \Big\{ \beta T_{f,i} - \gamma \sum_j p_{ij} \Big\} \tag{14}$$

where $\beta$ and $\gamma$ are system parameters that respectively
represent the “latency constant” and the “penalty constant”
and can be tuned to achieve the desired trade-off between
maximizing the energy reservation period and minimizing the
buffer overflow rate.
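
The selection implied by (14) can be sketched as a search over the candidate nodes. The code below is an assumption-laden illustration: the function name is ours, the probabilities p_ij are taken as given inputs (the text obtains them from (13)), and the three-node numbers are invented purely to show the trade-off.

```python
def select_off_phase(durations, out_of_date_prob, beta, gamma):
    """Return (best_i, best_score) maximizing J = beta * T_{f,i} - gamma * sum_j p_ij.

    durations[i]           -- candidate Off-Phase duration T_{f,i} of node i
    out_of_date_prob[i][j] -- p_ij, supplied by the caller (see (13))
    """
    best_i, best_score = None, float("-inf")
    for i, t_fi in enumerate(durations):
        score = beta * t_fi - gamma * sum(out_of_date_prob[i])
        if score > best_score:
            best_i, best_score = i, score
    return best_i, best_score


# Hypothetical three-node cluster; all numbers are illustrative only.
durations = [20.0, 40.0, 60.0]
p = [
    [0.00, 0.05, 0.10],
    [0.05, 0.10, 0.30],
    [0.20, 0.50, 0.90],
]
print(select_off_phase(durations, p, beta=1.0, gamma=50.0))
```

With these numbers the longest duration is rejected because its out-of-date penalty outweighs the extra power-off time, which is the trade-off that β and γ are meant to tune.
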

2) On-Phase Duration (Tn): During the On-Phase, normal
nodes send data packets through competition. Users that have
data packets to send access the channel to communicate, and
competing sources are allowed to send concurrently. Our
A-MAC-UWB protocol does not use mutual exclusion (as is
commonly done by random access or TDMA protocols) but, in
contrast, allows interference to occur and adapts to it. The
details of the working process will be discussed in the next
part. One of the advantages of our algorithm is that it
removes the overhead of the control packets used in carrier
sensing to avoid collisions, such as RTS/CTS in the CSMA/CA
scheme, while still ensuring successful transmission.

Based on our transmitter and receiver design, each
information bit is received by Nr nodes almost concurrently
during the period Ts; the transmitted signal is given in (2).
The estimation of the received signal is done by combining
the signals received at all Nr receiver nodes. The received
signal at the i-th user has been given in (4). In that
equation, besides the desired user’s signal part, $w^{(i)}(t)$
represents the multiple access interference (MAI) caused by
other users, and $n^{(i)}(t)$ is the AWGN.

Our physical layer design scheme takes advantage of both
virtual MIMO and UWB technologies. By using MIMO, we
substitute space diversity for the time diversity that is
commonly used in traditional TH-PPM-UWB signal generation to
improve performance. Moreover, the diversity is increased from
Ns to Nt x Nr. This modification is motivated by the strong
dependence of the traditional scheme on precise time
synchronization among users. In that scheme, a common timescale
must be set up among users to guarantee signal orthogonality
among them, which is used to mitigate MAI for the source
and destination nodes. In our algorithm, by contrast, each user
just focuses on the working sequence of its transmitter
antennas without being aware of the time difference with other
users.

The optimum detection strategy for this multiple-access
system leads to a multiuser receiver which, however, is too
complex and power-hungry to implement. More feasible
schemes are of interest. The simplest suboptimal receiver
is obtained by making two approximations. First, the MAI is
treated as a white Gaussian process [16]. The Gaussian
approximation is justified by the central limit theorem if
the users are many and have comparable powers. Second, a
dominant path is assumed to exist that conveys the major part
of the desired user’s energy.

We try to formulate the relationship between Prb and SNIR,
expressed as Prb = f(SNIR). Since SNIR is related to
parameters such as Ts and Ns, and the data rate Rb is
determined by the same parameters, we can adaptively adjust Rb
for various values of Prb. Note that a longer Ts and a larger
Ns mean that more energy is needed to achieve the expected
system performance, i.e., the successful data transmission
rate.

If we let the On-Phase duration of the nodes be Tn,tot, the
total number of data packets ($N_i$) arrived at node i is given
in (15), since the traffic arrival process is independent of
the data transmission process. Generally, there are two parts
of $N_i$: one is the data packets arrived during the Off-Phase,
denoted by $N_{f,i}$; the other is the data packets arrived during
the On-Phase, denoted by $N_{n,i}$.

$$N_i = N_{f,i} + N_{n,i} = \lambda_i (T_{f,tot} + T_{n,tot}) \tag{15}$$

In designing the On-Phase and Off-Phase durations, we not
only try to extend the power off time to save energy (by
avoiding more idle listening), but also need to keep data
packets up to date. With this in mind, the criterion for the
active duration is that the active duration ($T_{n,i}$) should be
long enough for all received data packets to be sent out. Then
we have

$$R_{b,i} T_{n,i} = \lambda_i (T_{n,tot} + T_{f,tot}) \tag{16}$$


Solving (16) for $R_{b,i}$, with $T_{n,i} = T_{n,tot}$, we get

$$R_{b,i} = \frac{\lambda_i (T_{f,tot} + T_{n,tot})}{T_{n,tot}} \tag{17}$$

From the other aspect, that of obtaining a satisfactory
successful data transmission rate, the data rate for node i is
$R'_{b,i} = K \times f^{-1}(Pr_b)$. We define the objective function
for the Tn design as:

$$U(T_{n,tot}) = \sum_i \left| R_{b,i} - R'_{b,i} \right| \tag{18}$$

Then the optimization task is as follows:

$$\arg\min_{T_{n,tot}} U(T_{n,tot}) \tag{19}$$
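
One simple way to approach the minimization in (19) is a grid search over candidate On-Phase durations. The sketch below is not the authors' procedure: it assumes the rates supported by the physical layer, K × f^{-1}(Prb), are already known per node and passed in as `supported_rates`, and every name and number in it is hypothetical.

```python
def required_rate(lam, t_off, t_on):
    """R_{b,i} from (17): rate needed to send all arrivals within the On-Phase."""
    return lam * (t_off + t_on) / t_on


def objective(t_on, t_off, arrival_rates, supported_rates):
    """U(T_{n,tot}) from (18): total mismatch between required and supported rates."""
    return sum(
        abs(required_rate(lam, t_off, t_on) - r_sup)
        for lam, r_sup in zip(arrival_rates, supported_rates)
    )


def best_on_phase(t_off, arrival_rates, supported_rates, candidates):
    """Grid search for (19) over an explicit list of candidate On-Phase durations."""
    return min(
        candidates,
        key=lambda t_on: objective(t_on, t_off, arrival_rates, supported_rates),
    )


# Hypothetical cluster: Off-Phase of 60 time units and two nodes.
arrival_rates = [1.0, 2.0]    # lambda_i
supported_rates = [4.0, 8.0]  # assumed values of K * f^{-1}(Pr_b) per node
candidates = [5.0, 10.0, 20.0, 40.0]
print(best_on_phase(60.0, arrival_rates, supported_rates, candidates))
```
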

Abstract: In this paper, we introduce a new method
for packet transmission delay analysis and prediction in
mobile ad hoc networks. We use a fuzzy logic system (FLS)
to coordinate the physical layer and the data link layer. We
demonstrate that type-2 fuzzy membership functions
(MFs), i.e., Gaussian MFs with uncertain variance, are
most appropriate to model BER and MAC layer service
time. Two FLSs, a singleton type-1 FLS and an interval
type-2 FLS, are designed to predict the packet transmission
delay based on the BER and MAC layer service time.
Simulation results show that the interval type-2 FLS
performs better than the type-1 FLS.

I.  Introduction 

The demand for Quality of Service (QoS) in mobile
ad hoc networks is growing rapidly. To enhance
the QoS, we consider the physical layer and the data-link
layer together, a cross-layer approach. A strict layered
design is not flexible enough to cope with the dynamics of
mobile ad hoc networks [1]. Cross-layer design can exploit
the layer interdependencies to optimize overall network
performance. The general methodology of cross-layer design
is to maintain the layered architecture, capture the
important information that influences other layers, exchange
the information between layers, and implement adaptive
protocols and algorithms at each layer to optimize the
performance.

Much previous work has focused on cross-layer
design for QoS provision. Liu [2] combines AMC at the
physical layer and ARQ at the data link layer. Ahn [3]
uses information from the MAC layer to do rate control at
the network layer for supporting real-time and best-effort
traffic. Akan [4] proposes a new adaptive transport layer
suite, including an adaptive transport protocol and an
adaptive rate control protocol, based on lower layer
information.

However, cross-layer design can produce unintended
interactions among protocols, such as adaptation loops.
It is hard to characterize the interaction at different
layers, and joint optimization across layers may lead to
complex algorithms.

In this paper, we discuss one of the parameters for
QoS: packet transmission delay. Our algorithm is
quite different from all the previous works. We propose
to use the Fuzzy Logic System (FLS) for packet
transmission delay analysis and prediction. We apply both a
singleton type-1 FLS and an interval type-2 FLS for the
analysis and prediction.

The remainder of this paper is structured as follows.
In Section II, we introduce the preliminaries. In
Section III, we give an overview of fuzzy logic systems.
In Section IV, we apply the FLS to the cross-layer
design. Simulation results and discussions are presented
in Section V. In Section VI, we conclude the paper.

II.  Preliminaries 

A.  IEEE  802.11a  OFDM  PHY 

The physical layer is the interface between the
wireless medium and the MAC [5]. The principle of OFDM
is to divide a high-speed binary signal to be transmitted
over a number of low data-rate subcarriers. A key
feature of the IEEE 802.11a PHY is that it provides 8 PHY
modes with different modulation schemes and coding
rates, making the idea of link adaptation feasible and
important.

B.  IEEE  802.11  MAC 

The 802.11 MAC uses Carrier-Sense Multiple Access
with Collision Avoidance (CSMA/CA) to achieve
automatic medium sharing between compatible stations.
In CSMA/CA, a station senses the wireless medium to
determine if it is idle before it starts transmission. If
the medium appears to be idle, the transmission may
proceed; otherwise the station will wait until the end of
the in-progress transmission. A station will ensure that
the medium has been idle for the specified inter-frame
interval before attempting to transmit.

In addition to the carrier sense and RTS/CTS mechanisms,
an acknowledgment (ACK) frame is sent by the
receiver upon successful reception of a data frame. Only
after receiving an ACK frame correctly does the transmitter
assume successful delivery of the corresponding data
frame. The sequence for a data transmission is:
RTS-CTS-DATA-ACK.

A mobile node will retransmit the data packet when
it detects a failed transmission. Retransmission of a single
packet can achieve a certain probability of delivery.
There is a relationship between the probability of
delivery p and the number of retransmissions n [6]:


The IEEE 802.11 standard requires that a data frame be
discarded by the transmitter’s MAC after a certain number
of unsuccessful transmission attempts. According to the
required probability of delivery, we choose the
minimum number of retransmissions.

When the MAC layer acquires access to the channel, the
nodes exchange the RTS-CTS-DATA-ACK packets.
Once the transmitter receives an ACK packet, the packet
has been transmitted successfully. In this paper, we assume
that there will always be best-effort traffic present that can
be locally and rapidly rate controlled in an independent
manner at each node to yield the necessary low delays and
stable throughputs.

C.  Bit  Error  Rate

BER is the number of bits with errors divided
by the total number of bits that have been transmitted,
received or processed over a given time period. It is a
measure of transmission quality. A high BER means a
high packet loss rate, and requests for resends will increase
latency. Delay-insensitive traffic typically requires a very
low BER.

D.  MAC  Layer  Service  Time

There are three basic processes when the MAC layer
transmits a packet [7]: the decrement process of the
backoff timer, the successful packet transmission process
that takes a time period of Tsuc, and the packet collision
process that takes a time period of Tcol. Here, Tsuc is the
random variable representing the period that the medium
is sensed busy because of a successful transmission, and
Tcol is the random variable representing the period that
the medium is sensed busy by each station due to
collisions. The MAC layer service time is the time interval
from the time instant that a packet becomes the head of
the queue and starts to contend for transmission, to the
time instant that either the packet is acknowledged for
a successful transmission or the packet is dropped. This
time is important when we examine the performance of
higher protocol layers.

E.  Packet  Transmission  Delay

The packet delay represents the time it takes to send
the packet between the transmitter and the next-hop
receiver, including the deferred time and the time to
fully acknowledge the packet. The packet transmission
delay between the mobile nodes includes three parts: the
wireless channel transmission delay, the Physical/MAC
layer transmission delay, and the queuing delay [8].

Defining D as the distance between two nodes and
C as the speed of light, the wireless channel transmission
delay is:

$$\mathrm{Delay}_{ch} = \frac{D}{C} \tag{2}$$

The Physical/MAC layer transmission delay is
decided by the interaction of the transmitter and the receive
channel, the node density, the node traffic intensity,
etc.

The queuing delay is decided by the mobile node I/O
system-processing rate and the subqueue length in the node.

In order to make the system “stable”, the rate at which
a node transfers packets intended for its destination must
satisfy, for all nodes, that the queue lengths will not be
infinite and the average delays will be bounded.


F.  One-step  Markov  Path  Model 

The mobile nodes roam independently with
variable ground speed. The mobility model is called the
one-step Markov path model [9]. In this model, the
probability of moving in the same direction as the previous
move is higher than that of moving in other directions,
which means the model has memory. Fig. 1 shows the
probabilities of the six directions.
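
The memory property of this mobility model can be sketched as a simple random process over six heading indices. The code below is only an illustration under our own assumptions: the previous heading is kept with probability `p_same` (0.5 is an assumed value) and the remaining probability is split evenly over the other five headings, which may differ from the exact probabilities in Fig. 1.

```python
import random


def next_direction(prev, p_same=0.5, n_dirs=6, rng=random):
    """Pick the next heading index in 0..n_dirs-1 given the previous one.

    ASSUMPTION: the previous heading is kept with probability p_same and the
    remaining mass is split evenly over the other n_dirs - 1 headings.
    """
    if rng.random() < p_same:
        return prev
    others = [d for d in range(n_dirs) if d != prev]
    return rng.choice(others)


def walk(start, steps, seed=0):
    """Generate a heading sequence of length steps + 1 starting from `start`."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        path.append(next_direction(path[-1], rng=rng))
    return path


print(walk(start=0, steps=10))
```
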

III.  Overview  of  Interval  Type-2  Fuzzy  Logic 
Systems 

Figure 2 shows the structure of a type-2 FLS [10]. It
is very similar to the structure of a type-1 FLS [11]. For
a type-1 FLS, the output processing block only contains
the defuzzifier. We assume that the reader is familiar
with type-1 FLSs, so that here we focus only on the
similarities and differences between the two FLSs.




Fig.  1.  One-step  Markov  Path  Model 




Fig.  2.  The  structure  of  a  type-2  FLS.  In  order  to  emphasize  the 
importance  of  the  type-reduced  set,  we  have  shown  two  outputs  for 
the  type-2  FLS,  the  type-reduced  set  and  the  crisp  defuzzified  value. 

The  fuzzifier  maps  the  crisp  input  into  a  fuzzy  set. 
This  fuzzy  set  can,  in  general,  be  a  type-2  set. 

In the type-1 case, we generally have “IF-THEN” rules, where the $l$th rule has the form “$R^l$: IF $x_1$ is $F_1^l$ and $x_2$ is $F_2^l$ and $\cdots$ and $x_p$ is $F_p^l$, THEN $y$ is $G^l$”, where: $x_i$s are inputs; $F_i^l$s are antecedent sets ($i = 1, \ldots, p$); $y$ is the output; and $G^l$s are consequent sets. The distinction between type-1 and type-2 is associated with the nature of the membership functions, which is not important while forming rules; hence, the structure of the rules remains exactly the same in the type-2 case, the only difference being that now some or all of the sets involved are of type-2; so, the $l$th rule in a type-2 FLS has the form “$R^l$: IF $x_1$ is $\tilde{F}_1^l$ and $x_2$ is $\tilde{F}_2^l$ and $\cdots$ and $x_p$ is $\tilde{F}_p^l$, THEN $y$ is $\tilde{G}^l$”.

In the type-2 case, the inference process is very similar to that in type-1. The inference engine combines rules and gives a mapping from input type-2 fuzzy sets to output type-2 fuzzy sets. To do this, one needs to find unions and intersections of type-2 sets, as well as compositions of type-2 relations.

In a type-1 FLS, the defuzzifier produces a crisp output from the fuzzy set that is the output of the inference engine, i.e., a type-0 (crisp) output is obtained from a type-1 set. In the type-2 case, the output of the inference engine is a type-2 set; so, “extended versions” (using Zadeh’s Extension Principle [12]) of type-1 defuzzification methods were developed in [10]. This extended defuzzification gives a type-1 fuzzy set. Since this operation takes us from the type-2 output sets of the FLS to a type-1 set, this operation was called “type-reduction” and the set so obtained was called a “type-reduced set” [10]. To obtain a crisp output from a type-2 FLS, we can defuzzify the type-reduced set.

General type-2 FLSs are computationally intensive, because type-reduction is very intensive. Things simplify a lot when secondary membership functions (MFs) are interval sets (in this case, the secondary memberships are either 0 or 1). When the secondary MFs are interval sets, the type-2 FLSs are called “interval type-2 FLSs”. In [13], Liang and Mendel proposed the theory and design of interval type-2 fuzzy logic systems (FLSs). They proposed an efficient and simplified method to compute the input and antecedent operations for interval type-2 FLSs, one that is based on a general inference formula for them. They introduced the concept of upper and lower membership functions (MFs) and illustrated their efficient inference method for the case of Gaussian primary MFs. They also proposed a method for designing an interval type-2 FLS in which they tuned its parameters.

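In an interval type-2 FLS with singleton fuzzification and meet under minimum or product t-norm, the result of the input and antecedent operations, $F^l$, is an interval type-1 set, i.e., $F^l = [\underline{f}^l, \overline{f}^l]$, where $\underline{f}^l$ and $\overline{f}^l$ simplify to

$$\underline{f}^l = \underline{\mu}_{\tilde{F}_1^l}(x_1) * \cdots * \underline{\mu}_{\tilde{F}_p^l}(x_p) \tag{3}$$

and

$$\overline{f}^l = \overline{\mu}_{\tilde{F}_1^l}(x_1) * \cdots * \overline{\mu}_{\tilde{F}_p^l}(x_p) \tag{4}$$

where $x_i$ ($i = 1, \ldots, p$) denotes the location of the singleton.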
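As a rough illustration of (3) and (4), the sketch below computes the firing interval of one rule under the product t-norm. It assumes Gaussian primary MFs with a fixed mean and a standard deviation known only to lie in an interval, the MF shape used later in this paper; the function names and the numerical values are hypothetical and are not taken from the report.

```python
import math

def gaussian(x, mean, std):
    """Gaussian membership grade of x (peak value 1 at x == mean)."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2)

def firing_interval(inputs, antecedents):
    """Return (lower, upper) firing degrees of one rule, product t-norm.

    antecedents holds one (mean, std_low, std_high) tuple per input.
    With a fixed mean and std in [std_low, std_high], the lower MF is the
    Gaussian with std_low and the upper MF is the Gaussian with std_high.
    """
    lower, upper = 1.0, 1.0
    for x, (mean, std_low, std_high) in zip(inputs, antecedents):
        lower *= gaussian(x, mean, std_low)   # eq. (3)
        upper *= gaussian(x, mean, std_high)  # eq. (4)
    return lower, upper

# Hypothetical rule with two antecedents (illustrative numbers only).
rule = [(0.5, 0.1, 0.2), (0.0, 0.1, 0.3)]
f_lower, f_upper = firing_interval([0.5, 0.2], rule)
print(f_lower, f_upper)
```

The interval collapses to a single value when `std_low == std_high`, which recovers the type-1 firing degree.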

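In this paper, we use center-of-sets type-reduction, which can be expressed as:

$$Y_{cos}(Y^1, \ldots, Y^M, F^1, \ldots, F^M) = [y_l, y_r] = \int_{y^1} \cdots \int_{y^M} \int_{f^1} \cdots \int_{f^M} 1 \Big/ \frac{\sum_{i=1}^{M} f^i y^i}{\sum_{i=1}^{M} f^i} \tag{5}$$

where $Y_{cos}$ is an interval set determined by two end points, $y_l$ and $y_r$; $f^i \in F^i = [\underline{f}^i, \overline{f}^i]$; $y^i \in Y^i = [y_l^i, y_r^i]$, and $Y^i$ is the centroid of the type-2 interval consequent set $\tilde{G}^i$; and $i = 1, \ldots, M$. Because $Y_{cos}$ is an interval set, we defuzzify it using the average of $y_l$ and $y_r$; hence, the defuzzified output of an interval type-2 FLS is

$$f(x) = \frac{y_l + y_r}{2} \tag{6}$$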


IV.  Modeling  BER  and  MAC  Layer  service 
time  with  Gaussian  Membership  Function 
A.  Analyzing  and  Modelling  BER 

Let $p$ be the probability that a bit is in error at any given time. Thus $p$ can be described as a random variable with a known mean value $E_a$.

Now, at any given time a bit is in error with probability $p$ and correct with probability $1-p$. Since each bit is either in error or correct, the number of bits in error, $E_b$, in a transmission of fixed length is a binomial random variable. Let the length of the transmission be $N_t$ bits. The probability that $E_b$ takes any value $x$ is:

$$P\{E_b = x\} = C_{N_t}^{x}\, p^x (1-p)^{N_t - x} \tag{7}$$

As the length of the transmission increases, the binomial distribution approaches a normal distribution with mean $\mu = pN_t$ and variance $\sigma^2 = p(1-p)N_t$.
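The normal approximation to (7) can be checked numerically. The sketch below is illustrative only: the transmission length and bit error probability are assumed values chosen for the example, not figures from this report, and the binomial pmf is evaluated through log-gamma to avoid overflow for large $N_t$.

```python
import math

def binomial_pmf(x, n_t, p):
    """P{E_b = x} from eq. (7), computed in the log domain for stability."""
    log_pmf = (math.lgamma(n_t + 1) - math.lgamma(x + 1)
               - math.lgamma(n_t - x + 1)
               + x * math.log(p) + (n_t - x) * math.log(1.0 - p))
    return math.exp(log_pmf)

def normal_pdf(x, mean, var):
    """Density of the approximating normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Assumed example values (not from the report).
N_T = 10_000      # transmission length in bits
P_ERR = 0.016     # bit error probability

mu = P_ERR * N_T                   # mean p * N_t
var = P_ERR * (1.0 - P_ERR) * N_T  # variance p * (1 - p) * N_t

for x in (140, 160, 180):
    print(x, binomial_pmf(x, N_T, P_ERR), normal_pdf(x, mu, var))
```

For short transmissions or very small $p$ the binomial distribution is noticeably skewed, so the agreement is weaker than in this example.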

In this paper, we set up fine membership functions (MFs) for BER. From the original BER data summarized in Table I, we decomposed the whole data set into ten segments and computed the mean $m_i$ and std $\sigma_i$ of the BER of the $i$th segment, $i = 1, 2, \ldots, 10$. We also computed the mean $m$ and std $\sigma$ of the entire BER. To see which value, $m_i$ or $\sigma_i$, varies more, we normalized the mean and std of each segment using $m_i/m$ and $\sigma_i/\sigma$, and we then computed the std of their normalized values, $\sigma_m$ and $\sigma_{std}$.

TABLE I

Mean and std values for ten segments and the entire BER, and their normalized std.

BER               mean        std
Segment 1
Segment 2         0.015618    0.027857
Segment 3         0.015528    0.017401
Segment 4         0.016206    0.02107
Segment 5         0.015721    0.017148
Segment 6         0.016298    0.029309
Segment 7         0.017062    0.037428
Segment 8         0.016253    0.022871
Segment 9         0.016448    0.023194
Segment 10
Entire Traffic    0.016198    0.025829
Normalized std    0.029161    0.26184

As we see from the last row of Table I, $\sigma_m \ll \sigma_{std}$. We conclude, therefore, that the BER of each segment (short range) is Gaussian with uncertain standard deviation. One example of a type-2 Gaussian MF with uncertain standard deviation is shown in Fig. 3.


Fig.  3.  Type-2  Gaussian  MF  with  uncertain  standard  deviation 

B. Analyzing and Modelling MAC Layer Service Time

Recent research by Zhai, Kwon and Fang [7] found that the lognormal distribution matches the MAC layer service time well, i.e., if the MAC layer service time for packet $i$ is $s_i$, then

$$\log_{10} s_i \sim \mathcal{N}(m, \sigma^2) \tag{8}$$

We, therefore, tried to model the logarithm of the MAC layer service time, to see if a Gaussian MF can match its nature. We decomposed the whole data set into ten segments and computed the mean $m_i$ and std $\sigma_i$ of the logarithm of the MAC layer service time of the $i$th segment, $i = 1, 2, \ldots, 10$. We also computed the mean $m$ and std $\sigma$ of the entire logarithm of the MAC layer service time. To see which value, $m_i$ or $\sigma_i$, varies more, we normalized the mean and std of each segment using $m_i/m$ and $\sigma_i/\sigma$, and we then computed the std of their normalized values, $\sigma_m$ and $\sigma_{std}$.

TABLE II

Mean and std values for ten segments and the entire logarithm of MAC layer service time, and their normalized std.

MAC layer service time    mean         std
Segment 1                 -1.1902      0.44295
Segment 2                 -1.1929      0.44698
Segment 3                 -1.1967      0.45237
Segment 4                 -1.1959      0.44835
Segment 5                 -1.1917      0.43598
Segment 6                 -1.1924      0.44779
Segment 7                 -1.1976      0.45687
Segment 8                 -1.1996      0.45554
Segment 9                 -1.1923      0.45068
Segment 10                -1.1997      0.462
Entire Traffic            -1.1949      0.44981
Normalized std            0.0028746    0.016421

As we see from the last row of Table II, $\sigma_m \ll \sigma_{std}$. We conclude, therefore, that the logarithm of the MAC layer service time of each segment (short range) is Gaussian with uncertain standard deviation, as shown in Fig. 3.
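The normalization step described above can be reproduced from the rounded entries of Table II. The sketch below is a minimal check, assuming that the "Normalized std" row is the sample standard deviation (denominator $n-1$) of the per-segment values divided by the entire-traffic value; that choice of estimator is an assumption, and the variable names are illustrative.

```python
import statistics

# Per-segment values transcribed from Table II
# (logarithm of the MAC layer service time).
SEG_MEAN = [-1.1902, -1.1929, -1.1967, -1.1959, -1.1917,
            -1.1924, -1.1976, -1.1996, -1.1923, -1.1997]
SEG_STD = [0.44295, 0.44698, 0.45237, 0.44835, 0.43598,
           0.44779, 0.45687, 0.45554, 0.45068, 0.462]
ALL_MEAN, ALL_STD = -1.1949, 0.44981  # "Entire Traffic" row

def normalized_std(values, reference):
    """Sample std of values / reference (n - 1 denominator, an assumption)."""
    return statistics.stdev([v / reference for v in values])

sigma_m = normalized_std(SEG_MEAN, ALL_MEAN)
sigma_std = normalized_std(SEG_STD, ALL_STD)
print(sigma_m, sigma_std)
```

Under that assumption the two results land close to the last row of Table II, and the std of the normalized means is several times smaller than the std of the normalized stds, which is the observation behind the choice of a fixed mean and an uncertain standard deviation.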

V.  Cross-layer  Design  Using  Interval  Type-2 
Fuzzy  Logic  System 

As introduced in the preliminaries, a high BER means a high packet loss rate, and requests for resends increase latency. Delay-insensitive traffic requires a very low BER. The MAC layer service time is also important when we examine the performance of higher protocol layers. BER and MAC layer service time therefore govern the packet transmission delay between the mobile nodes. We are now ready to evaluate the packet transmission delay using interval type-2 fuzzy logic systems.

We  predict  packet  transmission  delay  based  on  the 
following  two  antecedents: 

1)  Antecedent  1.  BER. 

2)  Antecedent  2.  MAC  layer  service  time. 

The consequent is the packet transmission delay. The linguistic variables used to represent the BER and the MAC layer service time were divided into three levels: low, moderate, and high. The consequent, the packet transmission delay, was divided into five levels: very low, low, moderate, high, and very high.

We  designed  questions  such  as: 

IF  BER  is  low  and  MAC  layer  service  time  is  high, 
THEN  the  packet  transmission  delay  is 


So we need to set up $3^2 = 9$ rules for this FLS (because every antecedent has 3 fuzzy subsets, and there are 2 antecedents). We summarize these rules in Table III.


TABLE III

Fuzzy Rules and Consequent

Antecedent 1    Antecedent 2    Consequent
Low             Low             Very Low
Low             Moderate        Low
Low             High            Moderate
Moderate        Low             Low
Moderate        Moderate        Moderate
Moderate        High            High
High            Low             Moderate
High            Moderate        High
High            High            Very High

We used Gaussian membership functions (MFs) to represent the antecedents and the consequent.


Fig. 4 shows the FLS application for the cross-layer design.


Fig.  4.  FLS  application  for  cross-layer  design 

When a mobile node is about to send a packet, it first predicts the packet transmission delay using the FLS algorithm. The node can then decide whether or not to send real-time traffic, because real-time services require low delay.

VI.  Simulations 

We implemented the simulation model using the OPNET modeler. The simulation region was 300x300 meters. There were 12 mobile nodes in the simulation model, and the nodes roamed independently with variable ground speed from 0 to 10 meters per second. The mobility model was the one-step Markov path model. The movement changed the distance between mobile nodes. We assumed that the data collected by each mobile node followed an exponential distribution with an arrival interval of 0.2 second, and that the packet length was 512 bits.

Because data communications in mobile networks have timing constraints, it is important to design the network algorithm to meet an end-to-end deadline [14]. We used the packet transmission delay to evaluate the network performance.

Each packet was labeled with a timestamp when it was generated by the source mobile node. When its destination mobile node received it, the time elapsed since the timestamp was taken as the transmission delay.

For the type-1 FLS, we chose Gaussian membership functions for the antecedents; for the interval type-2 FLS, we used Gaussian primary MFs with fixed mean and uncertain std for the antecedents. The steepest descent algorithm was used to train all the parameters based on the first 300 data sets. After training, the rules were fixed, and we tested the FLS based on the remaining 300 data sets.

Fig. 5. The RMSE of packet transmission delay prediction for two FLS approaches

In Fig. 5, we summarize the root-mean-square errors (RMSE) between the estimated packet transmission delay and the actual delay,

$$\mathrm{RMSE} = \sqrt{\frac{1}{300} \sum_{i=301}^{600} \left[ d(i) - f(i) \right]^2} \tag{9}$$

where $d(i)$ is the actual packet transmission delay and $f(i)$ is the estimated delay.
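Equation (9) is a standard root-mean-square error over the 300 test samples. A minimal sketch follows; the function name and the sample values are illustrative and are not simulation data from this report.

```python
import math

def rmse(actual, estimated):
    """Root-mean-square error between actual delays d(i) and estimates f(i)."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(d - f) ** 2 for d, f in zip(actual, estimated)]
    return math.sqrt(sum(squared) / len(squared))

# Illustrative values only.
d = [1.0, 2.0, 3.0]
f = [1.0, 2.0, 5.0]
print(rmse(d, f))
```

In the setting of (9), `actual` and `estimated` would each hold the 300 test-set entries with indices 301 to 600.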

The simulation results show that the interval type-2 FLS outperforms the type-1 FLS for packet transmission delay analysis and prediction.

We introduce the fuzzy logic system into cross-layer design. Compared with other algorithms for cross-layer design, the fuzzy method is flexible and simpler to implement. We can predict the packet transmission delay using information from the physical layer and the MAC layer only, which is a potential advantage in applications. Because we can estimate the packet transmission delay before the mobile node sends a packet, we can help ensure that the service meets the end-to-end delay deadline.


VII.  Conclusion 

Cross-layer design is an effective method to improve the performance of mobile ad hoc networks. We apply the fuzzy logic system to combine the physical layer and the data-link layer. We select BER and MAC layer service time as antecedents to analyze and predict the packet transmission delay, and we apply a type-1 FLS and an interval type-2 FLS to this analysis and prediction. Simulation results show that the interval type-2 FLS outperforms the type-1 FLS for packet transmission delay analysis and prediction.

Abstract: In this paper, we address a fundamental problem in Wireless Sensor Networks: how many hops does it take for a packet to be relayed over a given distance? For a deterministic topology, this question reduces to a simple geometry problem. However, a statistical study is needed for randomly deployed WSNs. We propose a Maximum Likelihood decision based on the joint pdf of $(H, r)$, which is also derived in this paper. Since the solution is not closed-form, we also propose an attenuated Gaussian approximation for the joint pdf. We show that the approximation visibly simplifies the decision process and the error analysis. Latency and energy consumption estimation are also included as application examples.

I.  Introduction 

The recent advances in MEMS, embedded systems and wireless communications enable the realization and deployment of wireless sensor networks (WSNs), which consist of a large number of densely deployed and self-organized sensor nodes. The potential applications of WSNs, such as environmental monitoring, often emphasize the importance of location information. Accordingly, geographic routing [1] was proposed to handle such requirements. Most likely, a packet is not routed to a specific node, but to a given location. An interesting question arises: “how many hops does it take to reach a given location?” The prediction of the number of hops is important not only in itself but also in helping to estimate the latency and energy cost, which are both important to the viability of WSNs.

The question becomes very simple if the sensor nodes are manually placed. For example, suppose sensor nodes are placed in a square grid with separation $d$. Obviously, the connectivity depends on the comparison of $d$ and the transmission range $R$. Suppose $d < R < \sqrt{2}d$; this is simply a 4-connectivity network. For any node, the possible distance of its first-hop neighbors is $\{d\}$, the possible distances of its second-hop neighbors are $\{\sqrt{2}d, 2d\}$, and so on. Generally, the possible distances of its $n$th-hop neighbors are $\{\sqrt{(n-i)^2 + i^2}\,d,\ i = 0, 1, 2, \ldots, \lceil n/2 \rceil\}$, where $\lceil n/2 \rceil$ is the smallest integer not less than $n/2$. If we compare the given distance with these distances, the required number of hops can be easily found. For some given distances, there could be two solutions, such as $(8-1)^2 + 1^2 = (10-5)^2 + (10-5)^2 = 50$; then we have to select the number of hops with higher probability. For the geometric approach, such conflicts can be easily solved without loss of accuracy. Thus, the geometric approach is more efficient and accurate than the statistical approach on a deterministic topology.
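The grid case can be worked through directly. The following sketch enumerates the $n$th-hop distances for a 4-connected square grid and lists every hop count that is consistent with a given distance; the function names, the search limit, and the tolerance are illustrative choices rather than part of the paper.

```python
import math

def hop_distances(n, d=1.0):
    """Possible distances of n-th hop neighbors on a 4-connected square grid."""
    return {math.sqrt((n - i) ** 2 + i ** 2) * d
            for i in range(0, math.ceil(n / 2) + 1)}

def candidate_hops(r, d=1.0, max_hops=50, tol=1e-9):
    """All hop counts n (up to max_hops) whose n-th hop distances contain r."""
    return [n for n in range(1, max_hops + 1)
            if any(abs(x - r) <= tol for x in hop_distances(n, d))]

# The ambiguous example from the text: squared distance 50 in units of d.
print(candidate_hops(math.sqrt(50)))
```

For a distance of $\sqrt{50}\,d$ the list contains both 8 and 10, which is the two-solution case mentioned above.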


Fig.  1.  The  nodes  in  a  square  grid  placement.  Only  nodes  within  4 
hops  are  shown. 

However, if sensor nodes are deployed in a random fashion, which is the case for most potential applications, the answer is beyond the reach of simple geometry. The stochastic nature of the random deployment calls for a statistical study. A natural and obvious estimate would be to divide the distance by the average inter-node distance (i.e., the average single-hop distance). However, such an estimate may be unable to provide the required accuracy. A probabilistic study is needed here, that is, finding $f(H|d)$, where $H$ is the number of hops. Although the question raised here has not been directly addressed before, a mirror problem, finding $f(d|h)$, has been well studied. In [2], Hou and Li studied the 2-D Poisson distribution to find an optimal transmission range. They found that the hop-distance distribution is determined not only by node density and transmission range but also by the routing strategy. They showed results for three routing strategies: Most Forward with Fixed Radius, Nearest with Forward Progress, and Most Forward with Variable Radius. Cheng and Robertazzi in [3] studied the one-dimensional Poisson point process and found the pdf of $r_i$ as

$$f_{r_i}(r_i) = \frac{\lambda e^{-\lambda (R - r_i)}}{1 - \cdots} \tag{1}$$

where $R$ is the transmission range, $\lambda$ is the node density, $r_i$ is the distance from the source to an $i$th-hop point, and $r_i$ is related to $r_{e_i}$ by

$$r_{e_i} + r_i = R. \tag{2}$$

The pdf of $r_{e_i}$ is also obtained,

$$f_{r_{e_i}}(r_{e_i}) = \frac{\lambda e^{-\lambda r_{e_i}}}{1 - \cdots} \tag{3}$$

Obviously, the distribution of $r_i$ depends on the previous $r_j$, $j < i$. They also pointed out that the 2-D Poisson point distribution is analogous to the 1-D case, replacing the length of the segment by the area of the range.

Vural and Ekici reexamined the study under the sensor networks circumstances in [4], and gave the mean and variance of multi-hop distance. They also proposed to approximate the multi-hop distance using a Gaussian distribution.

The rest of this paper is organized as follows. We provide some preliminaries on skewness and kurtosis in Section II. The number of hops prediction problem is addressed and solved in Section III. Since this problem has no closed-form solution, we propose an attenuated Gaussian approximation and show how to simplify the error analysis in Section IV. Application examples are shown in Section V. Section VI concludes this paper.

II. Preliminaries: Skewness and Kurtosis

In this section, we provide some preliminaries on statistical methods [5]. Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or sample set, is symmetric if it looks the same to the left and right of the center point.

Definition 1: [5] For a given sample set $X$,

$$m_3 = \sum (X - \bar{X})^3 / n, \tag{4}$$

$$m_2 = \sum (X - \bar{X})^2 / n, \tag{5}$$

where $\bar{X}$ is the sample mean of $X$, and $n$ is the size of $X$. Then a sample estimate of the skewness coefficient is given by

$$g_1 = \frac{m_3}{m_2^{3/2}}. \tag{6}$$

Skewness is zero for a symmetric distribution. Positive skewness indicates right skewness and negative indicates left.

Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution.

Definition 2: [5] A sample estimate of kurtosis for a sample set $X$ is given by

$$g_2 = m_4 / m_2^2 - 3, \tag{7}$$

where $m_4 = \sum (X - \bar{X})^4 / n$ is the fourth-order moment of $X$ about its mean.

Skewness and kurtosis are useful in determining whether a sample set is normal. Note that the skewness and kurtosis of a normal distribution are both zero; significant skewness and kurtosis clearly indicate that data are not normal.
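The sample estimators of skewness and kurtosis in Definitions 1 and 2 of this paper translate directly into code. The sketch below is a plain-Python illustration; the function names and the small sample are hypothetical.

```python
def central_moment(xs, k):
    """k-th sample moment about the mean, with denominator n."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** k for x in xs) / n

def skewness(xs):
    """Sample skewness g1 = m3 / m2^(3/2)."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def kurtosis(xs):
    """Sample (excess) kurtosis g2 = m4 / m2^2 - 3."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0

# A small symmetric sample: skewness is zero, kurtosis is negative (flat).
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
print(skewness(sample), kurtosis(sample))
```

Because 3 is subtracted in the kurtosis estimate, values near zero for both statistics are what one looks for when testing a sample for normality.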


III. The Number of Hops Prediction
A. Problem Formulation

We make the following assumptions.

•  The nodes are deployed at random on a plane, that is, the node distribution follows a 2-D Poisson random process. Thus, the probability that there is no node in a given area $A$ is given by [6]

$$\Pr(\text{No nodes in } A) = e^{-\lambda A}, \tag{8}$$

where $\lambda$ is the density of nodes.

•  The distance $d$ from the source to the destination is known, which is common in geographic routing.

•  Neither the source nor the destination is close to the border. This assumption holds true for most of the nodes if the network size is large enough.

The problem of interest is to find the number of hops, denoted $H$, needed to reach a specific destination at distance $r$ from a given source node. We can make a Maximum Likelihood (ML) decision,

$$\hat{H} = \arg\max_{H} f(H \mid r), \quad H = 1, 2, 3, \ldots \tag{9}$$

Considering

$$f(H \mid r) = \frac{f(H, r)}{f(r)}, \tag{10}$$

the decision rule can be translated into

$$\hat{H} = \arg\max_{H} f(H, r), \tag{11}$$

where $f(H, r)$ is also called the objective function. In the next subsection, we are concerned with deriving $p(H, r)$.




Fig.  2.  Poisson  node  distribution. 


B. Derivation of the Joint PDF $p(H, r)$

Let $r$ denote the distance from the source to a node. The cdf of $r$ is

$$F_r(r) = 1 - e^{-\lambda \pi r^2}, \tag{12}$$

and the pdf of $r$ is

$$f_r(r) = 2 \lambda \pi r\, e^{-\lambda \pi r^2}. \tag{13}$$

When $H = 1$, the joint cdf of $(H, r_1)$ is

$$p(H = 1, r_1) = \begin{cases} 1 - e^{-\lambda \pi r_1^2} & r_1 < R \\ \ldots & r_1 > R \end{cases} \tag{14}$$

and the joint pdf is

$$f(H = 1, r_1) = \begin{cases} 2 \lambda \pi r_1 e^{-\lambda \pi r_1^2} & r_1 < R \\ 0 & r_1 > R \end{cases} \tag{15}$$

Note that the conditional probability of $H = 1$ given $r < R$ is unity, which is intuitively correct but simple; we are more interested in the multi-hop distance. In the following, $r > R$ is assumed so that $H > 1$.

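The two-hop case is shown in Fig. 3. A second-hop node must satisfy $R < r_2 < 2R$. Furthermore, the farthest first-hop node is not necessarily at the maximum transmission range, which means there is a gap $r_{e_1}$ between $R$ and $r_1$, i.e.,

$$r_{e_1} + r_1 = R. \tag{16}$$

Fig. 3. The second-hop coverage.

Therefore, the joint pdf of $(H_1, r_{e_1})$ is

$$f(H = 1, r_{e_1}) = \begin{cases} 2 \lambda \pi (R - r_{e_1})\, e^{-\lambda \pi (R - r_{e_1})^2} & r_{e_1} < R \\ 0 & \text{o.w.} \end{cases} \tag{17}$$

Accordingly, the joint cdf of $(H_2, r_2)$ given $r_1$ is

$$p(H = 2, r_2 \mid r_1) = \begin{cases} e^{-\lambda \pi [(R + r_1)^2 - (r_2 + r_1)^2]} - e^{-\lambda \pi [(R + r_1)^2 - r_1^2]} & R < r < 2R \\ 0 & \text{o.w.} \end{cases} \tag{18}$$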

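Generally, for $H = n$ (shown in Fig. 4), we have

$$p(H_n, r_n \mid r_1, r_2, \ldots, r_{n-1}) = \begin{cases} e^{-\lambda \pi \left[ \left( R + \sum_{i=1}^{n-1} r_i \right)^2 - \left( \sum_{i=1}^{n} r_i \right)^2 \right]} - e^{-\lambda \pi \left[ \left( R + \sum_{i=1}^{n-1} r_i \right)^2 - \left( \sum_{i=1}^{n-1} r_i \right)^2 \right]} & R < r < nR \\ 0 & \text{o.w.} \end{cases} \tag{19}$$

$$p(H_n, r_n) = \int_{R}^{(n-1)R} \int_{R}^{(n-2)R} \cdots \int_{0}^{R} p(H_n, r_n \mid r_1, \ldots, r_{n-1})\, f(H_{n-1}, r_{n-1} \mid r_1, r_2, \ldots, r_{n-2}) \cdots f(H_1, r_1)\, dr_1 \cdots dr_{n-2}\, dr_{n-1} \tag{20}$$

Theoretically, we can take the derivative of (20) with respect to $r$ to obtain the objective function, use (11) to decide the most likely $H$ given $r$, and give the probability of error for such a decision. However, (20) is awkward to evaluate and the computational cost could limit the applicability of such a decision scheme.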

IV.  Attenuated  Gaussian  Approximation 


Since (20) is awkward to evaluate even using numerical methods, we use histograms collected from Monte Carlo simulations as a substitute for the joint pdf. All the simulation data were collected from a scenario in which $N$ sensor nodes were uniformly distributed in a circular region with a radius of 300 meters. For convenience, polar coordinates were used. The source node was placed at (0, 0). The transmission range was set to $R$ meters. For each setting of $(N, R)$, we ran 300 simulations, in each of which all nodes were re-deployed at random. The node density is given by

$$\lambda = \frac{N}{\pi\, 300^2}. \tag{21}$$

Fig. 5. Histograms of hop-distance joint distribution. ($R = 30$, $\lambda = 6.37 \times 10^{-3}$)

TABLE I

Statistics of f(H = n, r_n), n > 3

Number of Hops    Mean      Std       Skewness    Kurtosis
1                 19.991    7.0651    -0.57471    -0.58389
2                 45.132    7.8365    -0.16958    -1.0763
3                 72.01     8.2129    -0.10761    -1.0332
4                 99.45     8.391     -0.07938    -0.97857
5                 127.14    8.5323    -0.06445    -0.93104
6                 154.96    8.6147    -0.05341    -0.9004
7                 182.68    8.573     -0.07738    -0.91687

The histograms of $f(H, r)$ are plotted in Fig. 5, which clearly shows that the joint distribution of $(H, r)$ approaches the normal as $H$ increases. Table I lists the first-, second-, third- and fourth-order statistics of $f(H, r)$. The skewness and kurtosis clearly satisfy the Gaussianity condition within tolerance of error. Thus, the objective function can be approximated by

$$f(H = n, r_n) \approx a^n \mathcal{N}(m_n, \sigma_n) \tag{22}$$


where $a$ is the equivalent attenuation base, and $m_n$ and $\sigma_n$ are the mean and standard deviation (std), respectively. The specific values of these parameters can be evaluated from (20) numerically or estimated from simulations. Observing Table I, for large $n$, the joint pdf of $(H, r)$ has the following properties:

1)  $\sigma_n \approx \sigma_{n-1}$, which means the neighboring joint pdfs have similar spread.

2)  $m_n - m_{n-1} \approx m_{n+1} - m_n$, which means the joint pdfs are evenly spaced.

3)  $3 < \frac{m_{n+1} - m_n}{\sigma_n} < 5$, which means the overlap between the neighboring joint pdfs is small but not negligible. (As a rule of thumb, $Q(3)$ is considered relatively small and $Q(5)$ is regarded as negligible.)

4)  $\frac{m_{n+2} - m_n}{\sigma_n} > 5$, which means the overlap between the non-neighboring joint pdfs is negligible.

5)  $a < 1$. For large density $\lambda$, $a \to 1$. Along with Property 1, this tells us that the neighboring joint pdfs have nearly identical shape.

As shown in the following discussion, these properties greatly simplify the decision rule and the error analysis. Another interesting observation, besides these properties, is that the following equations do not hold:

$$m_n = n m_1 \tag{23}$$

$$m_n = nR \tag{24}$$

$$m_n = (n - 1)R + R/2 \tag{25}$$

Although these equations sound plausible, they all give visible errors. The aforementioned estimator $\lfloor r/R \rfloor + 1$ for $H$, though widely used, is not good in the new light shed by this study. However, Property 2 does tell us that the increment of $m_n$ is constant; denoting it by $\Delta$,

$$m_n = m_1 + (n - 1)\Delta. \tag{26}$$

We showed in [7] that $m_1 = \frac{2}{3}R$, irrespective of the node density. Although $\Delta$ is a function of $\lambda$ and $R$, $\lambda$ is often regarded as constant for a specific application and $R$ varies in a short range; thus, we can safely expect $\Delta = \alpha R$, where $\alpha$ is a constant, for example, $\alpha = 0.9$ for the data in Table I. In summary, the following empirical equation holds for most WSN applications:

$$m_n = R\left(\frac{2}{3} + (n - 1)\alpha\right). \tag{27}$$

The above result about the constant increment of the mean hop-distance is used in Section V-B for energy consumption estimation.
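The candidate formulas (23) to (25) and the empirical equation (27) can be compared against the simulated means in Table I. The sketch below uses the tabulated means with $R = 30$ and $\alpha = 0.9$ as stated in the text; the helper names and the use of the maximum absolute error as the comparison metric are illustrative choices.

```python
# Mean hop-distances m_1..m_7 transcribed from Table I (R = 30).
TABLE_MEANS = [19.991, 45.132, 72.01, 99.45, 127.14, 154.96, 182.68]
R = 30.0
ALPHA = 0.9  # empirical constant quoted in the text for Table I

def m_eq23(n):
    return n * TABLE_MEANS[0]              # m_n = n * m_1

def m_eq24(n):
    return n * R                           # m_n = n * R

def m_eq25(n):
    return (n - 1) * R + R / 2             # m_n = (n - 1) R + R / 2

def m_eq27(n):
    return R * (2.0 / 3.0 + (n - 1) * ALPHA)  # empirical equation (27)

def max_abs_error(model):
    """Largest |model(n) - tabulated mean| over n = 1..7."""
    return max(abs(model(n) - m) for n, m in enumerate(TABLE_MEANS, start=1))

for name, model in [("(23)", m_eq23), ("(24)", m_eq24),
                    ("(25)", m_eq25), ("(27)", m_eq27)]:
    print(name, round(max_abs_error(model), 3))
```

On these seven rows the empirical equation stays within about two meters of the tabulated means, while each of the three plausible-looking formulas drifts by ten meters or more at the larger hop counts.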

A. Decision Boundaries

Fig. 6. Gaussian Approximation.

Following (11), we decide $H$ given $r$ using the following rule:

$$\hat{H} = \arg\max_{H} f(H, r) \tag{28}$$

Observing $f(H_n, r_n)$ in Fig. 6, the decision is needed only between neighboring $H$, that is,

$$f(H = n, r) \underset{n+1}{\overset{n}{\gtrless}} f(H = n + 1, r). \tag{29}$$

This is because, for a specific value of $r$, there are only two values of $H$ with dominant $f(H, r)$, compared to which $f(H, r)$ for the other values of $H$ is negligible. Substituting (22) into (29), we obtain the decision boundary $d_n$ between the regions $H = n$ and $H = n + 1$:

$$d_n = \frac{B + \sqrt{B^2 - AC}}{A}$$
$$A = \sigma_{n+1}^2 - \sigma_n^2$$
$$B = m_n \sigma_{n+1}^2 - m_{n+1} \sigma_n^2$$
$$C = m_n^2 \sigma_{n+1}^2 - m_{n+1}^2 \sigma_n^2 + 2 \sigma_n^2 \sigma_{n+1}^2 \ln a \tag{30}$$

Using Property 1,

$$d_n = \frac{m_{n+1} + m_n}{2} - \frac{\sigma_n^2 \ln a}{m_{n+1} - m_n}. \tag{31}$$

For large density $\lambda$, Property 5 is applicable, and (30) simplifies to

$$d_n = \frac{\sigma_n m_{n+1} + \sigma_{n+1} m_n}{\sigma_n + \sigma_{n+1}}. \tag{32}$$

Applying Property 1 to (32),

$$d_n = \frac{m_n + m_{n+1}}{2}. \tag{33}$$

If we use the empirical equation (27),

$$d_n = \frac{2}{3}R + \left(n - \frac{1}{2}\right)\alpha R. \tag{34}$$

No matter which approximate solution we choose for $d_n$, the decision rule is given by

$$r \underset{n}{\overset{n+1}{\gtrless}} d_n. \tag{35}$$
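A minimal sketch of this decision rule, assuming the empirical boundary (34) is used for $d_n$: the code scans the boundaries in increasing order and returns the first $n$ whose boundary is not exceeded by $r$. The function names, the default search limit, and the example values $R = 30$ and $\alpha = 0.9$ (the setting quoted for Table I) are illustrative.

```python
def boundary(n, R, alpha):
    """Decision boundary d_n between H = n and H = n + 1, eq. (34)."""
    return R * (2.0 / 3.0 + (n - 0.5) * alpha)

def decide_hops(r, R, alpha, max_hops=1000):
    """Smallest n with r <= d_n, i.e. rule (35) applied boundary by boundary."""
    for n in range(1, max_hops + 1):
        if r <= boundary(n, R, alpha):
            return n
    raise ValueError("distance exceeds the searched range of hop counts")

# Example with R = 30 m and alpha = 0.9.
for r in (10.0, 45.0, 100.0):
    print(r, decide_hops(r, 30.0, 0.9))
```

With these values the first boundaries are roughly 33.5 m, 60.5 m, 87.5 m and 114.5 m, so a distance of 45 m is assigned two hops and 100 m is assigned four.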


In other words,

we decide $H = n$ if $d_{n-1} < r < d_n$, (36)

which is equivalent to

$$n = \left\lfloor \frac{r - \frac{2}{3}R}{\alpha R} + \frac{1}{2} \right\rfloor + 1. \tag{37}$$

B.  Error  Performance  Analysis 

For our decision rule, a decision error occurs when $H = n \neq h$. Thus, the probability of error for a specific $r$ is

$$P(e, r) = \sum_{n \neq h} f(n, r). \tag{38}$$

The total probability of error is obtained by integrating (38) over all possible $r$,

$$P(e) = \int P(e, r)\, dr. \tag{39}$$

According to Property 4, only $f(n - 1, r)$ and $f(n + 1, r)$ can have significant values over the decision region $[d_{n-1}, d_n]$, so

$$P(e) \approx \sum_{n=2}^{\infty} \int_{d_{n-1}}^{d_n} \left[ f(n - 1, r) + f(n + 1, r) \right] dr$$
$$= \sum_{n=2}^{\infty} \left\{ a^{n-1} \left[ Q\!\left(\frac{d_{n-1} - m_{n-1}}{\sigma_{n-1}}\right) - Q\!\left(\frac{d_n - m_{n-1}}{\sigma_{n-1}}\right) \right] + a^{n+1} \left[ Q\!\left(\frac{m_{n+1} - d_n}{\sigma_{n+1}}\right) - Q\!\left(\frac{m_{n+1} - d_{n-1}}{\sigma_{n+1}}\right) \right] \right\} \tag{40}$$

Note that

$$\frac{d_n - m_{n-1}}{\sigma_{n-1}} - \frac{d_{n-1} - m_{n-1}}{\sigma_{n-1}} = \frac{d_n - d_{n-1}}{\sigma_{n-1}} \gg 1, \tag{41}$$

therefore, $Q\!\left(\frac{d_n - m_{n-1}}{\sigma_{n-1}}\right)$ is negligible compared to $Q\!\left(\frac{d_{n-1} - m_{n-1}}{\sigma_{n-1}}\right)$. Similarly, $Q\!\left(\frac{m_{n+1} - d_{n-1}}{\sigma_{n+1}}\right)$ is negligible. (40) is approximated by

$$P(e) \approx a^3 Q\!\left(\frac{m_3 - d_2}{\sigma_3}\right) + \sum_{n=3}^{\infty} \left[ a^{n-1} Q\!\left(\frac{d_{n-1} - m_{n-1}}{\sigma_{n-1}}\right) + a^{n+1} Q\!\left(\frac{m_{n+1} - d_n}{\sigma_{n+1}}\right) \right]$$
$$= a^2 Q\!\left(\frac{d_2 - m_2}{\sigma_2}\right) + \sum_{n=3}^{\infty} a^n \left[ Q\!\left(\frac{m_n - d_{n-1}}{\sigma_n}\right) + Q\!\left(\frac{d_n - m_n}{\sigma_n}\right) \right] \tag{42}$$

Substituting an appropriate solution of $d_n$ into (42) would give us the probability of error within the required accuracy. For example, if we choose (33),

$$P(e) \approx a^2 Q\!\left(\frac{m_3 - m_2}{2\sigma_2}\right) + \sum_{n=3}^{\infty} a^n \left[ Q\!\left(\frac{m_n - m_{n-1}}{2\sigma_n}\right) + Q\!\left(\frac{m_{n+1} - m_n}{2\sigma_n}\right) \right] \tag{43}$$

V. Application Examples
A. Latency Estimation

Suppose it takes $T_{rx}$ for a sensor node to receive 1 bit of message and $T_{tx}$ to transmit. Considering that the transmission range in sensor networks is usually short compared to the speed of light, the propagation time $T_{pr}$ is negligible. As shown in Fig. 7, given the end-to-end distance $r$, we can find the required number of hops $H = n$ according to (35); thus, a good estimator of the total latency of an $l$-bit message is

$$l\left[T_{tx} + (n - 1)(T_{tx} + T_{rx}) + T_{rx}\right] \tag{44}$$
$$= l n (T_{tx} + T_{rx}) \tag{45}$$

Fig. 7. Time Model.

B. Energy Consumption Estimation

The following model is adopted from [8], where perfect power control is assumed. To transmit $l$ bits over distance $d$, the sender’s radio expends

$$E_{tx}(l, d) = \begin{cases} l E_{elec} + l \epsilon_{fs} d^2 & d < d_0 \\ l E_{elec} + l \epsilon_{mp} d^4 & d \geq d_0 \end{cases} \tag{46}$$

and the receiver’s radio expends

$$E_{rx}(l, d) = l E_{elec}. \tag{47}$$

$E_{elec}$ is the unit energy consumed by the electronics to process one bit of message, $\epsilon_{fs}$ and $\epsilon_{mp}$ are the amplifier factors for the free-space and multi-path models, respectively, and $d_0$ is the reference distance to determine which model to use. The values of these communication energy parameters are set as in Table II.

TABLE II

Energy Consumption Parameters

Name               Value
$d_0$              86.2 m
$E_{elec}$         50 nJ/bit
$E_{DA}$           5 nJ/bit
$\epsilon_{fs}$    10 pJ/bit/m^2
$\epsilon_{mp}$    0.0013 pJ/bit/m^4

Let $s_n$ denote the single-hop distance from the $(n-1)$th-hop
node to the $n$th-hop node. Obviously, $s_n < R$. In our
experimental setting, $R = 30$ m $< d_0$, so the free-space
model is always used. This agrees well with most applications,
in which multi-hop short-range transmission is preferred to
avoid the steep ($d^2$ to $d^4$) growth in energy consumption
for long-range transmission. Naturally, the end-to-end energy
consumption for sending $l$ bits over distance $r$ is given by

$$ E_{total}(l, r) = \sum_{n=1}^{h} \{ E_{tx}(l, s_n) + E_{rx}(l) \} = l \sum_{n=1}^{h} \{ E_{elec} + \epsilon_{fs} s_n^2 + E_{elec} \}, \tag{48} $$

where $h$ is the decision result for the given $r$. On average,

$$ E_{total}(l, r) = 2 h l E_{elec} + l \epsilon_{fs} \sum_{n=1}^{h} E[s_n^2]. \tag{49} $$
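The radio model in (46) to (48) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the constants are the Table II values converted to SI units, the function names are hypothetical, and the three 25 m hops in the example are made-up inputs rather than data from the report.

```python
# Radio energy model of (46)-(48). Constants are the Table II values in SI units.
E_ELEC = 50e-9        # J/bit, electronics energy per bit
EPS_FS = 10e-12       # J/bit/m^2, free-space amplifier factor
EPS_MP = 0.0013e-12   # J/bit/m^4, multi-path amplifier factor
D0 = 86.2             # m, reference distance separating the two models


def e_tx(l_bits, d):
    """Energy (J) to transmit l_bits over distance d, following (46)."""
    if d < D0:
        return l_bits * (E_ELEC + EPS_FS * d ** 2)
    return l_bits * (E_ELEC + EPS_MP * d ** 4)


def e_rx(l_bits):
    """Energy (J) to receive l_bits, following (47)."""
    return l_bits * E_ELEC


def e_total(l_bits, hop_distances):
    """End-to-end energy (J) over a path with the given single-hop distances, following (48)."""
    return sum(e_tx(l_bits, s) + e_rx(l_bits) for s in hop_distances)


if __name__ == "__main__":
    # Hypothetical example: a 1000-bit message relayed over three 25 m hops.
    print(e_total(1000, [25.0, 25.0, 25.0]))
```

Because every hop here is shorter than $d_0$, only the free-space branch of (46) is exercised, which matches the experimental setting $R = 30$ m described above.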


The relationship between $r_n$, $r_{n-1}$ and $s_n$ is depicted
in Fig. 8.

Fig. 8. The relationship between $r_n$, $r_{n-1}$ and $s_n$.

$$ t = s_n \cos A, \qquad \cos A = \frac{s_n^2 + r_n^2 - r_{n-1}^2}{2 s_n r_n}, $$
$$ s_n^2 = r_{n-1}^2 - r_n^2 + 2 t r_n. \tag{50} $$

For large $n$, $r_n \gg s_n$ and $r_{n-1} \gg s_n$; therefore, $t \to A$.
According to Property 2, $A$ can be treated as a constant.

$$ E[s_n^2] \approx E[r_{n-1}^2] - E[r_n^2] + 2A \, E[r_n] = \sigma_{n-1}^2 - \sigma_n^2 + 2A m_n \tag{51} $$
$$ \approx 2A m_n \quad (\text{by Property 1}). \tag{52} $$

Substituting (52) into (49),

$$ E_{total}(l, r) \approx 2 h l E_{elec} + 2 l \epsilon_{fs} A \sum_{n=1}^{h} m_n. \tag{53} $$

VI. Conclusion

To predict the number of hops $H$ needed to reach a
given distance $r$ in randomly deployed sensor networks,
we proposed an ML decision based on the joint pdf of
$(H, r)$, which was also derived in this paper. Since
the solution is not closed-form, we also proposed an
attenuated Gaussian approximation for the joint pdf.
We show that the approximation considerably simplifies the
decision process and the error analysis. The latency
and energy consumption estimation are also included as
application examples.

Abstract: We model the end-to-end distance for a given number of hops
in Wireless Sensor Networks in this paper. We derive that the
single-hop distance follows the distribution $2r/R^2$, where $R$ is
the transmission range. The end-to-end distance shows a Beta
distribution for two hops, and approaches a Gaussian distribution
when the number of hops is beyond three. As an application
example, we propose Statistical Distance Estimation, which shows
less distance error than Hop-TERRAIN and APS (Ad hoc
Positioning System). Our results are also applicable to other
applications for Wireless Sensor Networks.

I.  Introduction  and  Motivation 

In Wireless Sensor Networks (WSN), knowledge of node
location is required in many applications. Examples
include event reporting, target tracking, geographical routing,
and coverage evaluation. Generally, the distances from a node
with unknown location to several anchor nodes are estimated,
and then multilateration is applied to estimate the node
location. Distance is often estimated based on received signal
strength, time of arrival (TOA), time difference of arrival
(TDOA) or angle of arrival [1]. Angle-of-arrival based
ranging requires directive antennas or arrays, which are not
suitable for most microsensors. Similarly, measuring time of
flight requires a timing device with satisfactory resolution, as
in GPS. Although TDOA needs much less resolution, it often
requires extra acoustic or ultrasound emission, which comes
with higher price, larger size and more energy consumption, all
of which seem impractical for microsensors. Thus, the most
technically feasible ranging is based on received signal strength;
in fact, RSSI (Received Signal Strength Indication) is widely used in
wireless communications to provide distance estimation.

The underlying observation is that the average large-scale
path loss can be expressed as a function of distance by using
a path loss exponent, $n$ [2],

$$ \overline{PL}(d) = \overline{PL}(d_0) \left( \frac{d}{d_0} \right)^n, \tag{1} $$

where $n$ is the path loss exponent, which indicates the rate at
which the path loss increases with distance, $d_0$ is the close-in
reference distance, which is determined from measurements
close to the transmitter, and $d$ is the distance from the source
to the receiving point. Measurements have also shown that at
any value of $d$, the path loss $PL(d)$ at a particular location
is random and distributed log-normally (normal in dB) about
the mean distance-dependent value,

$$ PL(d)[dB] = \overline{PL}(d)[dB] + X_\sigma, \tag{2} $$

where $X_\sigma$ is a zero-mean Gaussian distributed random variable
(in dB) with standard deviation $\sigma$ (also in dB). The log-normal
shadowing is the main source of distance error for
received-signal-strength-based ranging methods. The values of $n$ and
$\sigma$ are often estimated empirically; for example, $n$ could vary
from 2 to 10 for different environments, and a typical value of
$\sigma$ in urban areas is around 10 dB.

Due to the log-normal shadowing, RSS-based ranging
can be very rough, especially indoors. For example, the
median localization error of commodity 802.11 technology
is 10 ft [3]. Such accuracy may be achieved by alternative
techniques, for example, exploiting the density of sensor
deployment to estimate the distance between nodes. Since the
sensor nodes are over-densely deployed, the distances between
the nodes are short and the variance of such distances is also
small. Therefore, it is quite promising to use the end-to-end
distance to obtain distance estimation [4], [5].
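To make the effect of log-normal shadowing on RSS-based ranging concrete, the sketch below expresses (1) in dB, adds the shadowing term of (2), and inverts the mean model to get a distance estimate. All numeric values (reference loss, exponent, true distance) and the function names are illustrative assumptions, not values taken from this report.

```python
import math
import random


def mean_path_loss_db(d, pl_d0_db, n, d0=1.0):
    """Mean path loss in dB at distance d, i.e. (1) expressed in dB."""
    return pl_d0_db + 10.0 * n * math.log10(d / d0)


def distance_from_path_loss(pl_db, pl_d0_db, n, d0=1.0):
    """Invert the mean path-loss model to turn a measured loss into a distance."""
    return d0 * 10.0 ** ((pl_db - pl_d0_db) / (10.0 * n))


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative values only: exponent 3, sigma 10 dB, 40 dB loss at d0 = 1 m.
    n, sigma_db, pl_d0_db, true_d = 3.0, 10.0, 40.0, 20.0
    # One shadowed measurement, following (2).
    measured = mean_path_loss_db(true_d, pl_d0_db, n) + rng.gauss(0.0, sigma_db)
    print(distance_from_path_loss(measured, pl_d0_db, n))
```

Under this model a shadowing deviation of $x$ dB scales the distance estimate by $10^{x/(10n)}$, so with $n = 3$ a single $+10$ dB deviation inflates the estimate by a factor of about 2.15.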

For example, both APS [4] and Hop-TERRAIN [6] find the
number of hops from a node to each of the anchors and then
multiply this hop count by a shared metric (average single-hop
distance) to estimate the range between the node and
each anchor. The known positions of anchor nodes and these
computed  ranges  are  then  used  to  perform  a  triangulation  to 
obtain  estimated  node  positions.  A  further  refinement  phase  is 
proposed  in  [6],  which  uses  least  squares  on  local  computation. 
However,  as  we  show  later,  the  distance  does  not  increase  lin¬ 
early  with  the  number  of  hops.  Therefore,  a  better  knowledge 
about  the  distribution  of  end-to-end  distance  for  given  number 
of  hops  could  cast  new  light  on  distance  estimation. 

Geometric probability studies of randomly distributed
nodes date back centuries. More recent studies include
[7], which examined a stochastic model of broadcast
percolation in one dimension and obtained the pdf of the hop
distance based on a Poisson node distribution [8]. Vural and
Ekici [9] reexamined this problem for WSN, and proposed
Gaussian approximations for multi-hop end-to-end distance.
In  these  studies,  the  following  equation  is  widely  cited  as  the 


pdf  for  single-hop  distance  on  a  line  [9], 




f{<)  = 


\e~x< 

1  _  e-HR-ri-1)  ’ 


(3) 


where $\lambda$ is the node density and $R$ is the transmission range.
However,  these  studies  are  based  on  farthest  delivery,  that 
is,  only  the  farthest  node  in  the  desired  direction  within  the 
transmission  range  would  relay  the  beacon  packets.  In  [7],  the 
locations  of  nodes  are  known  so  that  the  farthest  node  could 
be  chosen  as  the  next  hop.  When  we  plan  to  exploit  node 
distribution  to  estimate  distance  between  nodes,  we  cannot 
guarantee  the  beacon  packets  are  relayed  in  such  a  fashion, 
because  it  is  impossible  for  any  nodes  to  have  such  location 
information  a  priori.  In  fact,  routing  does  not  necessarily 
choose  the  farthest  node  for  reasons  such  as  energy  efficiency, 
minimizing  interference,  robustness  and  so  forth.  A  new  study 
must  be  carried  out  in  the  background  of  distance  estimation. 

The rest of this paper is organized as follows. Section II
provides some preliminaries for statistical analysis. We model the
single-hop distance and show that the derivation for higher-hop
end-to-end distance is beyond practical complexity in Section
III. Computer simulations and analysis are presented in Section
IV. In Section V, based on the knowledge of hop-distance
distribution, we propose Statistical Distance Estimation (SDE),
independent of ranging techniques. Section VI concludes this
paper.


II.  Preliminaries 


In  this  section,  we  provide  some  preliminaries  on  statistical 
methods  [10]. 

A.  Skewness  and  Kurtosis 

Skewness  is  a  measure  of  symmetry,  or  more  precisely,  the 
lack  of  symmetry.  A  distribution,  or  sample  set,  is  symmetric 
if  it  looks  the  same  to  the  left  and  right  of  the  center  point. 
Definition 1: [10] For a given sample set $X$,

$$ m_3 = \sum (X - \bar{X})^3 / n, \tag{4} $$

$$ m_2 = \sum (X - \bar{X})^2 / n, \tag{5} $$

where $\bar{X}$ is the sample mean of $X$, and $n$ is the size of $X$.
Then a sample estimate of the skewness coefficient is given by

$$ g_1 = \frac{m_3}{m_2^{3/2}}. \tag{6} $$

Skewness is zero for a symmetric distribution. Positive skewness
indicates right skewness and negative indicates left.

Kurtosis is a measure of whether the data are peaked or flat
relative to a normal distribution.

Definition 2: [10] A sample estimate of kurtosis for a
sample set $X$ is given by

$$ g_2 = m_4 / m_2^2 - 3, \tag{7} $$

where $m_4 = \sum (X - \bar{X})^4 / n$ is the fourth-order moment of $X$
about its mean.

Skewness and kurtosis are useful in determining whether a
sample set is normal. Note that the skewness and kurtosis of
a normal distribution are both zero; significant skewness and
kurtosis clearly indicate that data are not normal.


B. Chi-Square Test

The chi-square test is widely used to determine the goodness of
fit of a distribution to a set of experimental data. It works as
follows:

1. Partition the sample space into the union of $K$ disjoint
   intervals.

2. Compute the probability $b_k$ that an outcome falls in the
   $k$th interval under the postulated distribution. Then
   $m_k = n b_k$ is the expected number of outcomes that fall in the
   $k$th interval in $n$ repetitions of the experiment.

3. The chi-square statistic is defined as the weighted
   difference between the observed number of outcomes,
   $N_k$, that fall in the $k$th interval, and the expected number
   $m_k$:

   $$ D^2 = \sum_{k=1}^{K} \frac{(N_k - m_k)^2}{m_k}. \tag{8} $$

4. The hypothesis is rejected if $D^2 > t_\alpha$, where $t_\alpha$
   is a threshold determined by a given significance level.
   Otherwise, the fit is considered good.
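The four steps above can be sketched as follows. The observed counts are the 30 interval counts reported for the single-hop test in Section IV (Table II of this paper), and the threshold 52.34 is the one quoted there. Treating the 30 intervals as equiprobable is an assumption, suggested by the constant expected count of 555.37 in that table; the function name is hypothetical.

```python
def chi_square_statistic(observed, probabilities):
    """D^2 of (8): sum over intervals of (N_k - m_k)^2 / m_k, with m_k = n * b_k."""
    n = sum(observed)
    expected = [n * b for b in probabilities]
    return sum((o - m) ** 2 / m for o, m in zip(observed, expected))


# Observed counts per interval, as reported for the single-hop test.
OBSERVED = [539, 543, 546, 560, 583, 507, 541, 571, 562, 538,
            583, 564, 593, 555, 577, 563, 566, 537, 549, 499,
            577, 535, 552, 550, 611, 552, 566, 541, 570, 531]

K = len(OBSERVED)
# Assumption: equiprobable intervals, b_k = 1/K for every k.
D2 = chi_square_statistic(OBSERVED, [1.0 / K] * K)

THRESHOLD = 52.34  # 29 degrees of freedom at the 0.005 level, as quoted in the text

if __name__ == "__main__":
    print(round(D2, 4), "reject" if D2 > THRESHOLD else "fit considered good")
```

With these counts the statistic comes out at about 28.87, matching the value reported in Section IV and well below the threshold.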


III.  Modeling  End-to-end  Distance  for  Given 
Number  of  Hops 

A.  Problem  Formulation 

We assume a general beacon scenario, in which anchors
send out beacon packets informing other nodes about
their  locations.  These  beacon  packets  are  also  relayed  so 
that  nodes  outside  the  anchors’  transmission  range  could 
also  have  knowledge  about  their  locations.  Nonetheless, 
clarifications  about  several  terms  are  necessary,  because 
they  have  been  used  in  a  wide  variety  of  senses. 

Firstly,  our  study  on  end-to-end  distance  for  given  number 
of  hops  is  based  on  local  coordinate  system,  which 
could  be  translated  into  a  global  coordinate  system  if 
enough  nodes  in  the  local  coordinate  system  have  known 
global  coordinates.  In  previous  research,  anchors  refer 
to  beacons,  whose  locations  are  known  and  broadcast  to 
other  nodes.  However,  in  our  study,  an  anchor  is  simply 
a  specific  node  used  in  establishing  the  local  coordinate 
system.  An  anchor  could  have  global  coordinates  or  not, 
which  is  of  no  interest  to  our  study.  Therefore,  our 
study  is  applicable  to  both  anchor-based  and  anchor-free 
approaches. 

Secondly, we assume the beacon packets are distributed
in an ad hoc fashion. Although better routing schemes, such as
geographic routing, have been proposed for WSN, they are not
suitable for relaying beacon packets, because during this
phase most nodes have no knowledge of their own locations
or those of their neighbors. Under such circumstances,
we have to assume the beacon packets are simply flooded
throughout the sensor network, except that nodes relay only
the beacon packets arriving with the least number of
hops and discard those arriving via more hops.
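Minimum-hop flooding of this kind is equivalent to a breadth-first search from the anchor over the connectivity graph. The sketch below is one way to reproduce it in simulation; the function names and the quadratic neighbor scan are illustrative choices, not the authors' simulator.

```python
import math
import random
from collections import deque


def deploy_uniform_disc(n_nodes, radius, rng):
    """Place n_nodes uniformly at random in a disc centred on the origin."""
    nodes = []
    for _ in range(n_nodes):
        r = radius * math.sqrt(rng.random())   # sqrt makes the density uniform over area
        theta = 2.0 * math.pi * rng.random()
        nodes.append((r * math.cos(theta), r * math.sin(theta)))
    return nodes


def min_hop_counts(nodes, anchor, tx_range):
    """Flood a beacon from the anchor; return each node's minimum hop count (None if unreachable)."""
    points = [anchor] + list(nodes)
    hops = [None] * len(points)
    hops[0] = 0
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j, p in enumerate(points):
            # A node keeps only the first beacon it hears, which arrives via the fewest hops.
            if hops[j] is None and math.dist(points[i], p) <= tx_range:
                hops[j] = hops[i] + 1
                queue.append(j)
    return hops[1:]


if __name__ == "__main__":
    rng = random.Random(0)
    nodes = deploy_uniform_disc(500, 300.0, rng)   # hypothetical deployment size
    print(min_hop_counts(nodes, (0.0, 0.0), 30.0)[:10])
```

Grouping the anchor-to-node distances by the returned hop count gives the per-hop samples whose distribution the following sections model.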

Suppose the sensor nodes are placed on a plane at random
at an average density of $\lambda$ nodes per square meter. Let
$N(A)$ be the number of nodes in area $A$; it can be shown
that $N(A)$ is a two-dimensional Poisson point process
with density $\lambda$. One property of the Poisson process is that
if the number of nodes occurring in the area $A$ is $N$, then
the individual outcomes are distributed independently and
uniformly in the area $A$. That is, if $N$ nodes are placed
at random in the area $A$, then the probability that a specific
node lies in the subarea $B$ is $B/A$, given that $B$ is contained
in $A$.

Fig. 1. Uniform node distribution.

Assume the area $A$ is large enough and none of the anchor
nodes is near the border. Without loss of generality, we
center the polar coordinates at an arbitrary anchor node
(Fig. 1). This node can communicate directly with any
other node within the transmission range, say $R$. The
problem of interest is to find the distance from a specific
node to the anchor given that this node is within $i$ hops from
the anchor. The definitions of the variables we are working
with are listed in Table I. Note that the event $Hop_i$ can
also be described as "the minimum number of hops from
the anchor to the specific node is $i$".

TABLE I
Definition of Variables

Variable                        Definition
$\vec{r}_i = (r_i, \theta_i)$   the polar coordinates of the i-hop node
$s_i$                           the distance from the anchor to the i-hop node
$t_i$                           the distance from the (i-1)-hop node to the i-hop node
$Hop_i$                         the event "the specific node is within i hops,
                                but beyond i-1 hops from the anchor."

B.  Single-Hop  Case 

Consider the first-hop case. The conditional cdf can be
expressed by

$$ P[s_1 \le s \mid Hop_1] = P[s_1 \le s \mid r_1 \le R] = \frac{P[s_1 \le s]}{P[r_1 \le R]} = \frac{s^2}{R^2}, \qquad 0 \le s \le R. \tag{9} $$

Note that since the anchor node is placed at the origin,
the single-hop distance $t_1$ equals $r_1$. The conditional
pdf is the derivative of (9),

$$ f_{s_1 \mid Hop_1}(s) = \frac{2s}{R^2}. \tag{10} $$

The conditional expected value and variance can be easily
computed by

$$ E[s_1 \mid Hop_1] = \int_0^R s \, \frac{2s}{R^2} \, ds = \frac{2R}{3}, \tag{11} $$

$$ VAR[s_1 \mid Hop_1] = E[(s_1)^2] - (E[s_1])^2 = \int_0^R s^2 \, \frac{2s}{R^2} \, ds - \frac{4R^2}{9} = \frac{R^2}{18}. \tag{12} $$

(11) and (12) show that the expected value and variance
of the single-hop distance are solely determined by the
transmission range $R$ and are independent of the node
density $\lambda$. This is due to the uniform node distribution;
no matter how large the density is, it does not
bias $E[s_1 \mid Hop_1]$ or $VAR[s_1 \mid Hop_1]$.
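A quick Monte Carlo check of the single-hop results is sketched below. It draws distances by inverting the cdf $s^2/R^2$, so $s = R\sqrt{u}$ with $u$ uniform on $[0, 1)$, and compares the sample mean and standard deviation with $2R/3$ and $R/(3\sqrt{2})$. The sample size, seed and function name are arbitrary choices made for illustration.

```python
import math
import random
import statistics


def sample_single_hop_distances(tx_range, n_samples, rng):
    """Distances from the anchor to nodes placed uniformly in the disc of radius tx_range."""
    # Inverse-cdf sampling: P[s_1 <= s] = s^2 / R^2, so s = R * sqrt(u).
    return [tx_range * math.sqrt(rng.random()) for _ in range(n_samples)]


if __name__ == "__main__":
    R = 30.0
    d = sample_single_hop_distances(R, 200_000, random.Random(1))
    print("sample mean:", statistics.fmean(d), "theory:", 2.0 * R / 3.0)
    print("sample std: ", statistics.pstdev(d), "theory:", R / (3.0 * math.sqrt(2.0)))
```

The sampler never uses a node density, which mirrors the observation that the mean and variance depend on $R$ only.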

C.  Two-Hop  Case 


r-7 

Fig.  2.  Two  hops. 


Consider the two-hop case shown in Fig. 2. The distribution
of $s_1$, which is equal to $t_1$, is given in (9). Conditional
on the value of $s_1$, the cdf for $t_2$ is

P(^2  <  t\s\\Hop2)  —  03) 

where  B  is  the  area  of  the  region  inside  the  circle  of 
center  ri  but  outside  the  circle  of  center  fo.  B  is  equal 
to 

tt( t2f  -  (fi)2(0i  -  |  sin  20! )  -  (f2)2(0 2  -  \  sin  2 02), 

(14)
where


•h  =  cos_1(1-^)’  O5) 

fa  =  COS_1(^-).  (16) 

The  conditional  pdf  of  t2  is  obtained  by  taking  the 
derivative  of  (13). 

fti\si\Hop2{r)  ~  (^) 

By  taking  expected  value  of  (17), 

fti\Hop2(t)  =  fsi(s)jt^R 2ds’  (18) 

$s_2$ is determined by

$$ s_2 = \sqrt{(t_1)^2 + (t_2)^2 - 2 t_1 t_2 \cos\phi}, \tag{19} $$

where $\phi$ is the angle between $t_1$ and $t_2$ and is uniformly
distributed in $[-\theta_2, \theta_2)$. Although it is possible to derive
the pdf of $s_2$ from (19), it is awkward to evaluate
explicitly. Thus, for the end-to-end distance for two and
more hops, we will postulate their distributions from the
collected simulation data in the next section.

IV. Simulations and Analysis

All the simulation data were collected from a scenario
in which $N$ sensor nodes were uniformly distributed in a
circular region with a radius of 300 meters. For convenience,
polar coordinates were used. The anchor node was placed
at $(0, 0)$. The transmission range was set to $R$ meters. For
each setting of $(N, R)$, we ran 300 simulations, in each
of which all nodes were re-deployed from scratch.

Fig. 3. The histogram vs. postulated distribution for end-to-end distances for
given number of hops. (a) One-hop. (b) Two-hop. (c) Three-hop. (d) Four-hop.
(e) Five-hop. (f) Six-hop.

A. Single-Hop Distance

We plot (10) and the histogram of single-hop distances
collected from simulations together in Fig. 3 (a), which
clearly shows that (10) fits the experimental data very
well. Furthermore, a chi-square test was carried out to
determine the goodness of fit of (10) to the experimental
data.

TABLE II
Chi-Square Test for Single-Hop Distance Distribution

Interval   Observed   Expected   (O - E)^2/E
1          539        555.37     0.48233
2          543        555.37     0.27538
3          546        555.37     0.15798
4          560        555.37     0.038655
5          583        555.37     1.3749
6          507        555.37     4.2122
7          541        555.37     0.37165
8          571        555.37     0.44007
9          562        555.37     0.079229
10         538        555.37     0.54307
11         583        555.37     1.3749
12         564        555.37     0.13421
13         593        555.37     2.5501
14         555        555.37     0.00024208
15         577        555.37     0.84269
16         563        555.37     0.10492
17         566        555.37     0.20359
18         537        555.37     0.60741
19         549        555.37     0.072987
20         499        555.37     5.7209
21         577        555.37     0.84269
22         535        555.37     0.7469
23         552        555.37     0.020409
24         550        555.37     0.05186
25         611        555.37     5.573
26         552        555.37     0.020409
27         566        555.37     0.20359
28         541        555.37     0.37165
29         570        555.37     0.38557
30         531        555.37     1.0691

Chi-Square Value = 28.8728

The threshold for 30 - 1 = 29 degrees of freedom at
a 0.005 significance level is 52.34. Compared to this,
$D^2 = 28.8728$ is well within the threshold. Thus, we
establish that the data are in good agreement with (10).

B. Two-Hop End-to-end Distance

Since there is no closed-form formula for the conditional
pdf of end-to-end distance for two and more hops, we
have to find a fit for it. We postulate the following pdf
for the conditional pdf of two-hop end-to-end distance
according to the experimental data plotted in Fig. 3 (b).
The characteristic curve in Fig. 3 (b) clearly shows a Beta
distribution shape. The general pdf of the Beta distribution is

$$ f_X(x) = \frac{(x - a)^{p-1} (b - x)^{q-1}}{B(p, q) \, (b - a)^{p+q-1}}, \qquad a \le x \le b, \tag{20} $$

where $p$ and $q$ are the shape parameters, $a$ and $b$ are the
lower and upper bounds, respectively, of the distribution,
and $B(p, q)$ is the beta function. The beta function has
the formula

$$ B(p, q) = \int_0^1 t^{p-1} (1 - t)^{q-1} \, dt. \tag{21} $$

V.  An  Application  Example:  Statistical 
Distance  Estimation 

The knowledge about the end-to-end distance for a given
number of hops can be used widely in applications for
WSN. For example, we here propose Statistical Distance
Estimation (SDE).


The bounds $a$ and $b$ can be easily determined as $a = 0$
and $b = 2R$. Matching the maximum of (20) to the peak of
the histogram in Fig. 3 (b), the exponents $p - 1 = 3$ and
$q - 1 = 1$, that is, $p = 4$ and $q = 2$, would be a good guess.

Another noteworthy fact is that (20) is valid in $[0, 2R]$, while
we only consider $x \in [R, 2R]$ for the conditional two-hop
end-to-end distance distribution. Therefore, (20) should
be modified for the given condition $x \in [R, 2R]$, that is,

$$ f_{X \mid X \in [R, 2R]}(x) = \frac{f_X(x)}{F_X(2R) - F_X(R)}, \tag{22} $$

where $F_X(x)$ is the cdf of $f_X(x)$. Thus,

$$ f_{s_2 \mid Hop_2}(s) = \frac{(2R - s) \, s^3}{C \, B(2, 4) \, (2R)^5}, \qquad R \le s \le 2R, \tag{23} $$

where $B(p, q)$ is the beta function (note that $B(2, 4) = B(4, 2)$)
and $C = F_X(2R) - F_X(R)$. Note that $C$ is simply a constant
making (23) integrate to one:

$$ C = \int_R^{2R} \frac{(2R - s) \, s^3}{B(2, 4) \, (2R)^5} \, ds. \tag{24} $$

When $R = 30$, (24) gives us

$$ C = \int_{30}^{60} \frac{(60 - s) \, s^3}{B(2, 4) \, (60)^5} \, ds = 0.8125. \tag{25} $$
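The value $C = 0.8125$ can be checked numerically. The sketch below integrates the Beta-shaped numerator $(2R - s)s^3 / (B(2,4)(2R)^5)$ over $[R, 2R]$ with a composite Simpson rule; the helper names and the number of sub-intervals are illustrative choices.

```python
import math


def beta_fn(p, q):
    """Beta function via the gamma function: B(p, q) = G(p) G(q) / G(p + q)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)


def two_hop_shape(s, tx_range):
    """(2R - s) s^3 / (B(2,4) (2R)^5): the Beta pdf on [0, 2R] before restriction to [R, 2R]."""
    b = 2.0 * tx_range
    return (b - s) * s ** 3 / (beta_fn(2, 4) * b ** 5)


def simpson(f, lo, hi, n=1000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def normalising_constant(tx_range):
    """C: the probability mass of the Beta shape that falls in [R, 2R]."""
    return simpson(lambda s: two_hop_shape(s, tx_range), tx_range, 2.0 * tx_range)


if __name__ == "__main__":
    print(normalising_constant(30.0))
```

Substituting $u = s / (2R)$ shows that the integral does not depend on $R$, so the same constant applies to any transmission range.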


C.  Three-And-More-Hop  End-to-end  Distance 

When the number of hops increases beyond three, the
end-to-end distance distribution approaches Gaussian
(see Fig. 3 (d)(e)(f)). For a more formal analysis of
its Gaussianity, we list the skewness and kurtosis in
Table III. Since both skewness and kurtosis are
virtually zero within tolerance, we postulate a Gaussian
distribution for three-and-more-hop end-to-end distance.
The mean and std can be estimated from the experimental
data (see Table III). The postulated distribution and
histogram are drawn together in Fig. 3 (d)(e)(f), which
clearly shows a close match for each case.

TABLE III
Means and Stds for Three-And-More-Hop End-to-end Distances

Number of Hops   Mean     Std      Skewness    Kurtosis
3                72.01    8.2129   -0.10761    -1.0332
4                99.45    8.391    -0.079383   -0.97857
5                127.14   8.5323   -0.064453   -0.93104
6                154.96   8.6147   -0.053416   -0.9004
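The skewness and kurtosis columns follow Definitions 1 and 2 in Section II. A minimal sketch of those sample estimates is given below; the function names are hypothetical, and the five-point data set in the example is a toy input rather than simulation data.

```python
def central_moment(samples, order):
    """Sample central moment: sum of (x - mean)^order divided by n."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((x - mean) ** order for x in samples) / n


def skewness(samples):
    """g1 = m3 / m2^(3/2), as in (6)."""
    return central_moment(samples, 3) / central_moment(samples, 2) ** 1.5


def excess_kurtosis(samples):
    """g2 = m4 / m2^2 - 3, as in (7)."""
    return central_moment(samples, 4) / central_moment(samples, 2) ** 2 - 3.0


if __name__ == "__main__":
    toy = [1.0, 2.0, 3.0, 4.0, 5.0]   # symmetric and flat, so g1 = 0 and g2 < 0
    print(skewness(toy), excess_kurtosis(toy))
```

Applied to the per-hop distance samples, the two functions give the quantities tabulated above; for Gaussian data both estimates should be close to zero.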

A.  Protocol  Description 

SDE is designed for randomly, over-densely deployed
WSN, so that a smaller transmission range can be used
without loss of connectivity. SDE is used to obtain relatively
rough distances between nodes in order to establish a
local coordinate system. SDE starts with a core of anchors
with assigned coordinates. These anchors broadcast their
coordinates throughout the sensor network. Other nodes
keep and relay a minimum-hop beacon from each anchor.
A node can estimate its distance from an anchor based
on the minimum number of hops it takes the beacons
to travel from the anchor. Instead of using the product
of an average single-hop distance and the number of
hops, as in Hop-TERRAIN, SDE uses the mean of the
end-to-end distance for the minimum number of hops as the
estimator. Once a node's distances from three or more
non-collinear anchors are estimated, multilateration can
be used to determine its location.

In SDE, sensor nodes do not need full knowledge
of the end-to-end distance distribution. In
fact, a table of the mean distance for each possible
number of hops is sufficient. This table can be compiled
empirically from simulations for different node densities
and transmission ranges. According to the minimum number
of hops from the anchor, a node can look up the
corresponding mean in the table.
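The table lookup at the heart of SDE can be sketched as follows. The entries are for $R = 30$ m: the one-hop value is $2R/3$ from (11) and the values for three to six hops are the simulated means in Table III. The two-hop mean is not tabulated in the text, so it is omitted here. The baseline function and the single-hop metric passed to it are hypothetical stand-ins for a hop-count-times-metric estimator, not the actual Hop-TERRAIN or APS implementation.

```python
# Mean end-to-end distance (m) per minimum hop count, for R = 30 m.
# Hop 1: 2R/3 from (11). Hops 3-6: simulated means from Table III.
# The two-hop mean is not given in the text and is deliberately left out.
MEAN_DISTANCE_R30 = {1: 20.0, 3: 72.01, 4: 99.45, 5: 127.14, 6: 154.96}


def sde_estimate(min_hops, table=MEAN_DISTANCE_R30):
    """SDE estimator: the tabulated mean end-to-end distance for this hop count."""
    return table[min_hops]


def hop_times_metric_estimate(min_hops, avg_single_hop):
    """Baseline: hop count multiplied by a shared average single-hop distance."""
    return min_hops * avg_single_hop


if __name__ == "__main__":
    for hops in (3, 4, 5, 6):
        # 20.0 m is an assumed single-hop metric (2R/3), used only for comparison.
        print(hops, sde_estimate(hops), hop_times_metric_estimate(hops, 20.0))
```

A dictionary keyed by hop count keeps the per-node state small, which fits the observation that nodes need only the table of means rather than the full distribution.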


B.  Error  Analysis 

For the single-hop distance, we have derived the theoretical
distribution given by (10). From (12), we obtain

$$ MSE(Hop_1) = \sqrt{VAR[s_1 \mid Hop_1]} = \frac{R}{3\sqrt{2}}. \tag{26} $$

Here MSE is reported as the square root of the mean squared
error, so it is in meters. Note that MSE increases linearly with
the transmission range $R$. When $R = 30$ m, (26) gives us 7.0711,
which agrees well with the MSE of 7.1246 collected from
simulations. Other MSEs collected from simulations are also
listed in Table IV, which clearly shows that the distance
accuracy, indicated by MSE, decreases monotonically
with the number of hops. Consider a specific node: if
we decrease $R$ in the hope of decreasing MSE, it would
take more hops for the beacon to reach this node, which
could counteract the MSE reduction due to the reduced $R$. As
a rule of thumb, we found the best number of hops is two,
that is, it would be advisable to choose a transmission
range that keeps all nodes within two hops from the anchors.
Considering only the nodes within two hops from the
anchors, the average MSE is

$$ \frac{7.12 \times 5428 + 7.33 \times 11376}{5428 + 11376} = 7.27, \tag{27} $$

which is approximately $0.24R$. Compared to the $R/3$ distance
error provided by [4], [6], our statistical approach
achieved a smaller error, which is within a quarter of the
transmission range.


TABLE IV
MSE from Simulations (R = 30 m)

Number of Hops   MSE           Sample Size
1                7.12          5428
2                7.33          11376
3                8.75          [illegible]
4                9.77          [illegible]
5                [illegible]   33804
6                10.89         40770

VI.  Conclusion 

In this paper, we study the modeling of the end-to-end
distance for a given number of hops in WSN. The
experiments showed that the distance does not increase
linearly with the number of hops. Therefore, the distance
should be analyzed for each number of hops. We derived
the distribution for single-hop distance and also
showed that the complexity of derivation for multiple-hop
distance is beyond practical interest. Thus, we postulate a
Beta distribution for two-hop end-to-end distance and a
Gaussian distribution for three-and-more-hop end-to-end
distance. Computer simulations showed our postulated
distributions agree well with the histograms.

We also propose Statistical Distance Estimation, in which
statistically exploiting the knowledge of hop-distance
distribution reduces the distance error from $R/3$ to $R/4$.
Such fundamental knowledge about end-to-end distance
distribution is applicable to other applications for WSN,
such as planning and/or optimization in deployment and
resource management.